Abstract

In this work, an explicit wiretap coding scheme based on polar lattices is proposed to achieve the secrecy capacity
of the additive white Gaussian noise (AWGN) wiretap channel. Firstly, polar lattices are used to construct
secrecy-good lattices for the mod-$\Lambda_s$ Gaussian wiretap channel. Then we propose an explicit shaping scheme to remove
this mod-$\Lambda_s$ front end and extend polar lattices to the genuine Gaussian wiretap channel. The shaping technique is
based on the lattice Gaussian distribution, which leads to a binary asymmetric channel at each level for the multilevel
lattice codes. By employing the asymmetric polar coding technique, we construct an AWGN-good lattice and a
secrecy-good lattice with optimal shaping simultaneously. As a result, the encoding complexity for the sender and the
decoding complexity for the legitimate receiver are both $O(N \log N \log(\log N))$. The proposed scheme is proven to
be semantically secure.


I. Introduction 

Wyner [1] introduced the wiretap channel model and showed that both reliability and confidentiality could be
attained by coding without any key bits if the channel between the sender and the eavesdropper (wiretapper's channel
$W$) is degraded with respect to the channel between the sender and the legitimate receiver (main channel $V$). The
goal of wiretap coding is to design a coding scheme that makes it possible to communicate both reliably and securely
between the sender and the legitimate receiver. Reliability is measured by the decoding error probability for the
legitimate user, namely $\lim_{N\to\infty} \Pr\{\hat{M} \neq M\} = 0$, where $N$ is the length of the transmitted codeword,
$M$ is the confidential message and $\hat{M}$ is its estimate. Secrecy is measured by the mutual information between $M$
and the signal $Z^{[N]}$ received by the eavesdropper. In this work, we will follow the strong secrecy condition proposed
by Csiszár [2], i.e., $\lim_{N\to\infty} I(M; Z^{[N]}) = 0$, which is more widely accepted than the weak secrecy criterion
$\lim_{N\to\infty} \frac{1}{N} I(M; Z^{[N]}) = 0$. In simple terms, the secrecy capacity is defined as the maximum achievable
rate under both the reliability and strong secrecy conditions. When $W$ is degraded with respect to $V$, the secrecy
capacity is given by $C(V) - C(W)$ [3], where $C(\cdot)$ denotes the channel capacity.
In the study of strong secrecy, plaintext messages are often assumed to be random and uniformly distributed.
From a cryptographic point of view, it is crucial that the security does not rely on the distribution of the message.
This issue can be resolved by using the standard notion of semantic security [4], which means that, asymptotically,
it is impossible to estimate any function of the message better than to guess it without accessing $Z^{[N]}$ at all. The
relation between strong secrecy and semantic security was recently revealed in [5], [6], namely, semantic security
is equivalent to achieving strong secrecy for all distributions $p_M$ of the plaintext messages:

\[ \lim_{N\to\infty} \max_{p_M} I(M; Z^{[N]}) = 0. \tag{1} \]



Fig. 1. The Gaussian wiretap channel. 


In this work, we construct lattice codes for the Gaussian wiretap channel (GWC), which is shown in Fig. 1. The
confidential message $M$ drawn from the message set $\mathcal{M}$ is encoded by the sender (Alice) into an $N$-dimensional
codeword $X^{[N]}$. The outputs $Y^{[N]}$ and $Z^{[N]}$ received by the legitimate receiver (Bob) and the eavesdropper (Eve) are
respectively given by

\[ Y^{[N]} = X^{[N]} + W_b^{[N]}, \qquad Z^{[N]} = X^{[N]} + W_e^{[N]}, \]

where $W_b^{[N]}$ and $W_e^{[N]}$ are $N$-dimensional Gaussian noise vectors with zero mean and variance $\sigma_b^2$ and $\sigma_e^2$ respectively.
The channel input $X^{[N]}$ satisfies the power constraint $P_s$, i.e.,

\[ \frac{1}{N} E\left[\|X^{[N]}\|^2\right] \le P_s. \]
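As a quick numerical illustration of the secrecy capacity $C(V) - C(W)$ for this channel, the following sketch assumes the standard per-dimension AWGN capacity $\frac{1}{2}\log_2(1 + P_s/\sigma^2)$ for both links. The function names and the operating point ($P_s = 10$, $\sigma_b^2 = 1$, $\sigma_e^2 = 4$) are illustrative choices, not values from the paper.

```python
import math


def awgn_capacity(power: float, noise_var: float) -> float:
    """Capacity of a real AWGN channel in bits per dimension."""
    return 0.5 * math.log2(1.0 + power / noise_var)


def gwc_secrecy_capacity(power: float, var_bob: float, var_eve: float) -> float:
    """C(V) - C(W) for a degraded Gaussian wiretap channel (var_bob <= var_eve)."""
    if var_bob > var_eve:
        raise ValueError("Eve's channel must be degraded: need var_bob <= var_eve")
    return awgn_capacity(power, var_bob) - awgn_capacity(power, var_eve)


if __name__ == "__main__":
    # Hypothetical operating point: P_s = 10, sigma_b^2 = 1, sigma_e^2 = 4.
    print(gwc_secrecy_capacity(10.0, 1.0, 4.0))
```

The secrecy capacity vanishes when the two noise variances coincide and grows as Eve's noise variance increases relative to Bob's.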

Polar codes [7] have shown their great potential in solving the wiretap coding problem. The polar coding scheme
proposed in [8], combined with the block Markov coding technique [9], was proved to achieve the strong secrecy
capacity when $W$ and $V$ are both binary-input symmetric channels, and $W$ is degraded with respect to $V$. More
recently, polar wiretap coding has been extended to general wiretap channels (not necessarily degraded or symmetric)
in [10] and [11]. For continuous channels such as the GWC, there has also been notable progress in wiretap lattice
coding. On the theoretical side, the existence of lattice codes achieving the secrecy capacity to within 1/2 nat under
the strong secrecy as well as semantic security criterion was demonstrated in [6]. On the practical side, wiretap
lattice codes were proposed in [12] and [13] to maximize the eavesdropper's decoding error probability.

A. Our contribution 

Polar lattices, the counterpart of polar codes in the Euclidean space, have already been proved to be additive
white Gaussian noise (AWGN)-good [14] and further to achieve the AWGN channel capacity with lattice Gaussian
shaping [15].$^1$ Motivated by [8], we will propose polar lattices to achieve both strong secrecy and reliability over
the mod-$\Lambda_s$ GWC. Conceptually, this polar lattice structure can be regarded as a secrecy-good lattice $\Lambda_e$ nested
within an AWGN-good lattice $\Lambda_b$ ($\Lambda_e \subset \Lambda_b$). Further, we will propose a Gaussian shaping scheme over $\Lambda_b$ and $\Lambda_e$,
using the multilevel asymmetric polar coding technique. As a result, we will accomplish the design of an explicit
lattice coding scheme which achieves the secrecy capacity of the GWC. The novel technical contribution of this
paper is two-fold:

• The construction of secrecy-good polar lattices for the mod-$\Lambda_s$ GWC and the proof that they achieve the
secrecy capacity. This is an extension of the binary symmetric wiretap coding [8] to the multilevel coding scenario,
and can also be considered as the construction of secrecy-good polar lattices for the GWC without the power
constraint. The construction for the mod-$\Lambda_s$ GWC provides considerable insight into wiretap coding for the
genuine GWC, without deviating to the technicality of Gaussian shaping. This work is also of independent
interest to other problems of information theoretic security, e.g., secret key generation from Gaussian sources
[?].

• The Gaussian shaping applied to the secrecy-good polar lattice, which follows the approach of [15]. The
resultant coding scheme is proved to achieve the secrecy capacity of the GWC. This coding scheme is further
proved to be semantically secure. The idea follows that of [6], where lattice Gaussian sampling was
employed to obtain semantic security. It is worth mentioning that our proposed coding scheme is not only a
practical implementation of the secure random lattice coding in [6], but also an improvement in the sense that
we successfully remove the constant 1/2-nat gap to the secrecy capacity.$^2$

B. Comparison with the extractor-based approach 

Invertible randomness extractors were introduced into wiretap coding in [?], [?], [?]. The key idea is that
an extractor is used to convert a capacity-achieving code with rate close to $C(V)$ for the main channel into a
wiretap code with rate close to $C(V) - C(W)$. Later, this coding scheme was extended to the GWC in [22].
Besides, channel resolvability [23] was proposed as a tool for wiretap codes. An interesting connection between
the resolvability and the extractor was revealed in [24].

1 Please refer to [?]-[?] for other methods of achieving the AWGN channel capacity.

2 The 1/2-nat gap in [6] was due to a requirement on the flatness factor of the secrecy-good lattice. In this paper, we employ mutual information,
rather than the flatness factor, to directly bound the information leakage, thereby removing that requirement on the secrecy-good lattice.
The proposed approach and the one based on invertible extractors have their respective advantages. The
extractor-based approach is modular, i.e., the error-correction code and extractor are realized separately; it is possible to harness
the results on invertible extractors in the literature. The advantage of our lattice-based scheme is that the wiretap code
designed for Eve is nested within the capacity-achieving code designed for Bob, which represents an integrated
approach. More importantly, lattice codes are attractive for emerging applications in network information theory
thanks to their useful structures [?], [?]; thus the proposed scheme may fit better with this landscape when
security is a concern [26].

C. Outline of the paper 

The paper is organized as follows: Section II presents some preliminaries of lattice codes. In Section III we
construct secrecy-good polar lattices for the mod-$\Lambda_s$ GWC, using the binary symmetric polar wiretap coding and
multilevel lattice structure [27]. The original polar wiretap code in [8] is slightly modified to be compatible with the
following shaping operation. In Section IV, we show how to implement the discrete Gaussian shaping over the
polar lattice to remove the mod-$\Lambda_s$ front end, using the polar coding technique for asymmetric channels. Then we
prove that our wiretap lattice coding achieves the secrecy capacity with shaping. Furthermore, the strong secrecy
is extended to semantic security. Finally, we discuss the relationship between the lattice constructions with and
without shaping in Section V.

D. Notations 

All random variables (RVs) will be denoted by capital letters. Let $P_X$ denote the probability distribution of a RV
$X$ taking values $x$ in a set $\mathcal{X}$ and let $H(X)$ denote its entropy. For multilevel coding, we denote by $X_\ell$ a RV $X$
at level $\ell$. The $i$-th realization of $X_\ell$ is denoted by $x_\ell^i$. We also use the notation $x_\ell^{i:j}$ as a shorthand for a vector
$(x_\ell^i, \ldots, x_\ell^j)$, which is a realization of RVs $X_\ell^{i:j} = (X_\ell^i, \ldots, X_\ell^j)$. Similarly, $x_{\ell:j}^i$ will denote the realization of the
$i$-th RVs from level $\ell$ to level $j$, i.e., of $X_{\ell:j}^i = (X_\ell^i, \ldots, X_j^i)$. For a set $\mathcal{I}$, $\mathcal{I}^c$ denotes its complement, and $|\mathcal{I}|$
represents its cardinality. For an integer $N$, $[N]$ will be used to denote the set of all integers from 1 to $N$. $W$ and
$\tilde{W}$ will be used to denote a binary memoryless asymmetric (BMA) channel and a binary memoryless symmetric
(BMS) channel respectively. Following the notation of [7], we denote $N$ independent uses of channel $W$ by $W^N$.
By channel combining and splitting, we get the combined channel $W_N$ and the $i$-th subchannel $W_N^{(i)}$. Specifically,
for a channel $W_\ell$ at level $\ell$, $W_\ell^N$, $W_{\ell,N}$ and $W_{\ell,N}^{(i)}$ are used to denote its $N$ independent expansion, the combined
channel and the $i$-th subchannel after polarization. $\mathbb{1}(\cdot)$ denotes the indicator function. Throughout this paper, we
use the binary logarithm, denoted by log, and information is measured in bits.

II. Preliminaries of Lattice Codes

A. Definitions

A lattice is a discrete subgroup of $\mathbb{R}^n$ which can be described by

\[ \Lambda = \{ \lambda = Bx : x \in \mathbb{Z}^n \}, \]

where $B$ is the $n$-by-$n$ lattice generator matrix, which we always assume to have full rank in this paper.

For a vector $x \in \mathbb{R}^n$, the nearest-neighbor quantizer associated with $\Lambda$ is $Q_\Lambda(x) = \arg\min_{\lambda \in \Lambda} \|\lambda - x\|$. We define the
modulo lattice operation by $x \bmod \Lambda = x - Q_\Lambda(x)$. The Voronoi region of $\Lambda$, defined by $\mathcal{V}(\Lambda) = \{x : Q_\Lambda(x) = 0\}$,
specifies the nearest-neighbor decoding region. The Voronoi cell is one example of a fundamental region of the
lattice. A measurable set $\mathcal{R}(\Lambda) \subset \mathbb{R}^n$ is a fundamental region of the lattice $\Lambda$ if $\cup_{\lambda \in \Lambda}(\mathcal{R}(\Lambda) + \lambda) = \mathbb{R}^n$ and if
$(\mathcal{R}(\Lambda) + \lambda) \cap (\mathcal{R}(\Lambda) + \lambda')$ has measure 0 for any $\lambda \neq \lambda'$ in $\Lambda$. The volume of a fundamental region is equal to
that of the Voronoi region $\mathcal{V}(\Lambda)$, which is given by $\mathrm{Vol}(\Lambda) = |\det(B)|$.
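The quantizer, the modulo operation and the volume formula can be sketched in a few lines. For a general generator matrix the nearest-neighbor search is a hard problem, so this sketch assumes the scaled integer lattice $a\mathbb{Z}^n$, where coordinate-wise rounding is exact; the function names are illustrative only.

```python
import numpy as np


def quantize_scaled_zn(x: np.ndarray, a: float) -> np.ndarray:
    """Nearest point of the lattice a*Z^n to x (coordinate-wise rounding)."""
    return a * np.round(x / a)


def mod_scaled_zn(x: np.ndarray, a: float) -> np.ndarray:
    """x mod Lambda = x - Q_Lambda(x) for Lambda = a*Z^n."""
    return x - quantize_scaled_zn(x, a)


def lattice_volume(B: np.ndarray) -> float:
    """Vol(Lambda) = |det(B)| for a full-rank generator matrix B."""
    return abs(float(np.linalg.det(B)))


if __name__ == "__main__":
    x = np.array([0.7, -1.2, 2.49])
    print(mod_scaled_zn(x, 1.0))            # residue inside the Voronoi region
    print(lattice_volume(2.0 * np.eye(3)))  # volume of 2*Z^3
```

The residue always lies in the cube $[-a/2, a/2]^n$, which is the Voronoi region of $a\mathbb{Z}^n$ (ties on the boundary are broken by the rounding rule).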

The theta series of $\Lambda$ (see, e.g., [28, p.70]) is defined as

\[ \Theta_\Lambda(\tau) = \sum_{\lambda \in \Lambda} e^{-\pi \tau \|\lambda\|^2}, \qquad \tau > 0. \]

In this paper, to satisfy the reliability condition for Bob, we are mostly concerned with the block error probability
$P_e(\Lambda, \sigma^2)$ of lattice decoding. It is the probability $\Pr\{x \notin \mathcal{V}(\Lambda)\}$ that an $n$-dimensional independent and identically
distributed (i.i.d.) Gaussian noise vector $x$ with zero mean and variance $\sigma^2$ per dimension falls outside the Voronoi
region $\mathcal{V}(\Lambda)$. For an $n$-dimensional lattice $\Lambda$, define the volume-to-noise ratio (VNR) of $\Lambda$ by

\[ \gamma_\Lambda(\sigma) = \frac{\mathrm{Vol}(\Lambda)^{2/n}}{\sigma^2}. \]
Then we introduce the notion of lattices which are good for the AWGN channel without power constraint. 


Definition 1 (AWGN-good lattices): A sequence of lattices $\Lambda_b$ of increasing dimension $n$ is AWGN-good if, for
any fixed $P_e(\Lambda_b, \sigma^2) \in (0, 1)$,

\[ \lim_{n \to \infty} \gamma_{\Lambda_b}(\sigma) = 2\pi e \]

and if, for a fixed VNR greater than $2\pi e$, $P_e(\Lambda_b, \sigma^2)$ goes to 0 as $n \to \infty$.

It is worth mentioning here that we do not insist on exponentially vanishing error probabilities, unlike Poltyrev's
original treatment of good lattices for coding over the AWGN channel [29]. This is because a sub-exponential or
polynomial decay of the error probability is often good enough.


B. Flatness Factor and Lattice Gaussian Distribution 


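For $\sigma > 0$ and $c \in \mathbb{R}^n$, the Gaussian distribution of mean $c$ and variance $\sigma^2$ is defined as

\[ f_{\sigma,c}(x) = \frac{1}{(\sqrt{2\pi}\sigma)^n} e^{-\frac{\|x - c\|^2}{2\sigma^2}} \]

for all $x \in \mathbb{R}^n$. For convenience, let $f_\sigma(x) = f_{\sigma,0}(x)$.
Given lattice $\Lambda$, we define the $\Lambda$-periodic function

\[ f_{\sigma,\Lambda}(x) = \sum_{\lambda \in \Lambda} f_{\sigma,\lambda}(x) = \frac{1}{(\sqrt{2\pi}\sigma)^n} \sum_{\lambda \in \Lambda} e^{-\frac{\|x - \lambda\|^2}{2\sigma^2}} \]

for $x \in \mathbb{R}^n$.

The flatness factor is defined for a lattice $\Lambda$ as [6]

\[ \epsilon_\Lambda(\sigma) = \max_{x \in \mathcal{R}(\Lambda)} \left| \mathrm{Vol}(\Lambda) f_{\sigma,\Lambda}(x) - 1 \right|. \]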
It can be interpreted as the maximum variation of $f_{\sigma,\Lambda}(x)$ from the uniform distribution over $\mathcal{R}(\Lambda)$. The flatness
factor can be calculated using the theta series [6]:

\[ \epsilon_\Lambda(\sigma) = \left( \frac{\gamma_\Lambda(\sigma)}{2\pi} \right)^{\frac{n}{2}} \Theta_\Lambda\left( \frac{1}{2\pi\sigma^2} \right) - 1. \]
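The agreement between the max-based definition and the theta-series formula can be checked numerically. The sketch below assumes the one-dimensional lattice $\Lambda = \mathbb{Z}$ (so $n = 1$ and $\mathrm{Vol}(\Lambda) = 1$), truncates both sums, and evaluates the maximum on a finite grid; all function names are illustrative.

```python
import math


def theta_series_z(tau: float, terms: int = 200) -> float:
    """Truncated theta series of Z: sum_k exp(-pi * tau * k^2)."""
    return sum(math.exp(-math.pi * tau * k * k) for k in range(-terms, terms + 1))


def flatness_factor_z_theta(sigma: float) -> float:
    """Flatness factor of Z from the theta-series formula (n = 1, Vol = 1)."""
    vnr = 1.0 / sigma**2  # gamma_Lambda(sigma) = Vol^(2/n) / sigma^2
    tau = 1.0 / (2.0 * math.pi * sigma**2)
    return math.sqrt(vnr / (2.0 * math.pi)) * theta_series_z(tau) - 1.0


def flatness_factor_z_direct(sigma: float, grid: int = 201, terms: int = 200) -> float:
    """Flatness factor of Z as max |Vol * f_{sigma,Lambda}(x) - 1| over a grid on [-1/2, 1/2]."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    worst = 0.0
    for j in range(grid):
        x = -0.5 + j / (grid - 1)
        periodic = norm * sum(
            math.exp(-((x - k) ** 2) / (2.0 * sigma**2)) for k in range(-terms, terms + 1)
        )
        worst = max(worst, abs(periodic - 1.0))
    return worst


if __name__ == "__main__":
    for s in (0.3, 0.5, 1.0):
        print(s, flatness_factor_z_theta(s), flatness_factor_z_direct(s))
```

For this lattice the maximum is attained at the lattice point $x = 0$, and the flatness factor shrinks rapidly as $\sigma$ grows, which is the regime in which the aliased Gaussian looks uniform.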


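We define the discrete Gaussian distribution over $\Lambda$ centered at $c \in \mathbb{R}^n$ as the following discrete distribution
taking values in $\lambda \in \Lambda$:

\[ D_{\Lambda,\sigma,c}(\lambda) = \frac{f_{\sigma,c}(\lambda)}{f_{\sigma,c}(\Lambda)}, \qquad \forall \lambda \in \Lambda, \]

where $f_{\sigma,c}(\Lambda) = \sum_{\lambda \in \Lambda} f_{\sigma,c}(\lambda) = f_{\sigma,\Lambda}(c)$. Again for convenience, we write $D_{\Lambda,\sigma} = D_{\Lambda,\sigma,0}$.

It is also useful to define the discrete Gaussian distribution over a coset of $\Lambda$, i.e., the shifted lattice $\Lambda - c$:

\[ D_{\Lambda-c,\sigma}(\lambda - c) = \frac{f_\sigma(\lambda - c)}{f_{\sigma,c}(\Lambda)}, \qquad \forall \lambda \in \Lambda. \]

Note the relation $D_{\Lambda-c,\sigma}(\lambda - c) = D_{\Lambda,\sigma,c}(\lambda)$, namely, they are a shifted version of each other.

Each component of a lattice point sampled from $D_{\Lambda-c,\sigma}$ has an average power always less than $\sigma^2$ by the
following lemma.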


Lemma 1 (Average power of lattice Gaussian [30]): Let $x = (x_1, x_2, \ldots, x_n)^T \sim D_{\Lambda-c,\sigma}$. Then, for each
$1 \le i \le n$,

\[ E[x_i^2] \le \sigma^2. \tag{2} \]
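A small numerical check of the power bound is sketched below. It assumes the simplest setting, the centered distribution $D_{\mathbb{Z},\sigma}$ (that is, $\Lambda = \mathbb{Z}$ and $c = 0$), and truncates the support to a finite window; the function names and the window size are illustrative.

```python
import math


def discrete_gaussian_pmf_z(sigma: float, support: int = 60) -> dict:
    """Truncated D_{Z,sigma}: probabilities proportional to exp(-k^2 / (2 sigma^2))."""
    weights = {k: math.exp(-(k * k) / (2.0 * sigma**2)) for k in range(-support, support + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def average_power(pmf: dict) -> float:
    """E[x^2] for x distributed according to the given pmf over the integers."""
    return sum(p * k * k for k, p in pmf.items())


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0, 3.0):
        print(s, average_power(discrete_gaussian_pmf_z(s)), s * s)
```

For small $\sigma$ the average power falls well below $\sigma^2$ because most of the mass sits on the origin; for large $\sigma$ it approaches $\sigma^2$ from below.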


If the flatness factor is negligible, the discrete Gaussian distribution over a lattice preserves the capacity of the 
AWGN channel. 


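Theorem 1 (Mutual information of discrete Gaussian distribution [30]): Consider an AWGN channel $Y = X + E$
where the input constellation $X$ has a discrete Gaussian distribution $D_{\Lambda-c,\sigma_s}$ for arbitrary $c \in \mathbb{R}^n$, and where the
variance of the noise $E$ is $\sigma^2$. Let the average signal power be $P_s$ so that $\mathrm{SNR} = P_s/\sigma^2$, and let
$\tilde{\sigma} = \frac{\sigma_s \sigma}{\sqrt{\sigma_s^2 + \sigma^2}}$. Then, if $\epsilon = \epsilon_\Lambda(\tilde{\sigma}) < \frac{1}{2}$ and $\frac{\pi \epsilon_t}{1 - \epsilon_t} \le \epsilon$ where

\[ \epsilon_t = \begin{cases} \epsilon_\Lambda\left(\sigma_s / \sqrt{\pi/(\pi - t)}\right), & t \ge 1/e \\ (t^{-4} + 1)\, \epsilon_\Lambda\left(\sigma_s / \sqrt{\pi/(\pi - t)}\right), & 0 < t < 1/e, \end{cases} \]

the discrete Gaussian constellation results in mutual information

\[ I_D \ge \frac{1}{2} \log(1 + \mathrm{SNR}) - \frac{5\epsilon}{n} \tag{3} \]

per channel use.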

A lattice $\Lambda$ or its coset $\Lambda - c$ with a discrete Gaussian distribution is referred to as a good constellation for the
AWGN channel if $\epsilon_\Lambda(\tilde{\sigma})$ is negligible [30]. It is further proved in [30] that the channel capacity is achieved with
Gaussian shaping over an AWGN-good lattice and minimum mean square error (MMSE) lattice decoding. Following
Theorem 1, it has been shown in [15] that an AWGN-good polar lattice shaped according to the discrete Gaussian
distribution achieves the AWGN channel capacity with sub-exponentially vanishing error probability, which means
that an explicit polar lattice satisfying the power constraint and the reliability condition for Bob is already in hand.
Therefore, the next section will focus on the construction of the secrecy-good polar lattice.


III. Secrecy-good Polar Lattices for the Mod-$\Lambda_s$ GWC


A. Polar codes: brief review

We firstly recall some basics of polar codes. Let $\tilde{W}$ be a BMS channel with uniformly distributed input $X \in
\mathcal{X} = \{0, 1\}$ and output $Y \in \mathcal{Y}$. The input distribution and transition probability of $\tilde{W}$ are denoted by $P_X$ and $P_{Y|X}$
respectively. Let $X^{[N]}$ and $Y^{[N]}$ be the input and output vectors of $N$ independent uses of $\tilde{W}$. Suppose $N = 2^m$ for
some integer $m \ge 1$. Channel polarization results from the transform $U^{[N]} = X^{[N]} G_N$, where $G_N = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}^{\otimes m}$
is the generator matrix and $\otimes$ denotes the Kronecker product. Then we get an $N$-dimensional combined channel
$\tilde{W}_N$ from $U^{[N]}$ to $Y^{[N]}$. For each $i \in [N]$, given the previous bits $U^{1:i-1}$, the channel $\tilde{W}_N^{(i)}$ seen by each bit $U^i$ is
called the $i$-th subchannel after the channel splitting process [7], and the transition probability of $\tilde{W}_N^{(i)}$ is
given by

\[ \tilde{W}_N^{(i)}(y^{[N]}, u^{1:i-1} | u^i) = \sum_{u^{i+1:N} \in \mathcal{X}^{N-i}} \frac{1}{2^{N-1}} \tilde{W}_N(y^{[N]} | u^{[N]}), \]

where $u^{[N]}$ and $y^{[N]}$ are the realizations of $U^{[N]}$ and $Y^{[N]}$, respectively. Arikan proved that $\tilde{W}_N^{(i)}$ is also a BMS
channel and that it becomes either an almost error-free channel or a completely useless channel as $N$ grows. According
to [7], the goodness of a BMS channel can be estimated by its associated Bhattacharyya parameter, which is defined
as follows.


Definition 2 (Bhattacharyya parameter of BMS channels): Let $\tilde{W}$ be a BMS channel with transition probability
$P_{Y|X}$. The symmetric Bhattacharyya parameter $Z \in [0, 1]$ is defined as

\[ Z(\tilde{W}) = \sum_y \sqrt{P_{Y|X}(y|0) P_{Y|X}(y|1)}. \]


It was further shown in [31], [32] that for any $0 < \beta < \frac{1}{2}$,

\[ \lim_{m \to \infty} \frac{1}{N} \left| \{ i : Z(\tilde{W}_N^{(i)}) < 2^{-N^\beta} \} \right| = I(\tilde{W}), \]
\[ \lim_{m \to \infty} \frac{1}{N} \left| \{ i : Z(\tilde{W}_N^{(i)}) > 1 - 2^{-N^\beta} \} \right| = 1 - I(\tilde{W}), \]

which means the proportion of such roughly error-free subchannels (with negligible Bhattacharyya parameters)
approaches the channel capacity $I(\tilde{W})$. The set of the indices of all those almost error-free subchannels is usually
called the information set $\mathcal{I}$ and its complement is called the frozen set $\mathcal{F}$. Consequently, the construction of
capacity-achieving polar codes is simply to identify the indices in the information set $\mathcal{I}$. However, for a general
BMS channel other than the binary erasure channel, the complexity of the exact computation of $Z(\tilde{W}_N^{(i)})$ appears to
be exponential in the block length $N$. An efficient estimation method for $Z(\tilde{W}_N^{(i)})$ was proposed in [33], using the
idea of channel upgrading and degrading. It was shown that with a sufficient number of quantization levels, the
approximation error is negligible even if $\tilde{W}$ has continuous output, and the involved computational complexity is
acceptable.
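For the binary erasure channel, the one case where the computation is exact and cheap, the Bhattacharyya parameters follow a simple recursion: a channel with parameter $z$ splits into a worse one with $2z - z^2$ and a better one with $z^2$. The sketch below uses this to select an information set. The 0-based index ordering produced by the recursion is an assumption of this sketch and may differ from the ordering induced by $G_N$ up to a bit-reversal permutation.

```python
def bec_bhattacharyya(erasure_prob: float, m: int) -> list:
    """Bhattacharyya parameters of the N = 2^m subchannels of a BEC.

    Each parameter z is replaced by the pair (2z - z^2, z^2), i.e. the
    degraded and the upgraded subchannel of one polarization step.
    """
    z = [erasure_prob]
    for _ in range(m):
        z = [v for zi in z for v in (2.0 * zi - zi * zi, zi * zi)]
    return z


def information_set(z: list, beta: float) -> list:
    """0-based indices i with Z(W_N^(i)) < 2^(-N^beta)."""
    n = len(z)
    threshold = 2.0 ** (-(n ** beta))
    return [i for i, zi in enumerate(z) if zi < threshold]


if __name__ == "__main__":
    z = bec_bhattacharyya(0.5, 10)
    info = information_set(z, 0.3)
    print(len(info) / len(z))  # fraction of nearly error-free subchannels
```

At finite block length the fraction of selected indices stays below the capacity $1 - \epsilon$ of the erasure channel, and the gap closes slowly as $m$ grows.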


January 5, 2016 


DRAFT 








In [7], a bit-wise decoding method called successive cancellation (SC) decoding was proposed to show that polar
codes are able to achieve channel capacity with vanishing error probability. This decoding method has complexity
$O(N \log N)$, and the error probability is given by $P_e^{SC} \le \sum_{i \in \mathcal{I}} Z(\tilde{W}_N^{(i)})$.


B. Polar codes for the binary symmetric wiretap channel 


Now we revisit the construction of polar codes for the binary symmetric wiretap channel. We use $\tilde{V}$ and $\tilde{W}$ to
denote the symmetric main channel between Alice and Bob and the symmetric wiretap channel between Alice and
Eve, respectively. Both $\tilde{V}$ and $\tilde{W}$ have binary input $X$, and $\tilde{W}$ is degraded with respect to $\tilde{V}$. Let $Y$ and $Z$ denote
the outputs of $\tilde{V}$ and $\tilde{W}$. After the channel combination and splitting of $N$ independent uses of $\tilde{V}$ and $\tilde{W}$ by the
polarization transform $U^{[N]} = X^{[N]} G_N$, we define the sets of reliability-good indices for Bob and information-poor
indices for Eve as

\[ \mathcal{G}(\tilde{V}) = \{ i : Z(\tilde{V}_N^{(i)}) \le 2^{-N^\beta} \}, \qquad \mathcal{N}(\tilde{W}) = \{ i : Z(\tilde{W}_N^{(i)}) \ge 1 - 2^{-N^\beta} \}, \tag{4} \]

where $0 < \beta < 0.5$ and $\tilde{V}_N^{(i)}$ ($\tilde{W}_N^{(i)}$) is the $i$-th subchannel of the main channel (wiretapper's channel) after the
polarization transform.

Note that in the seminal paper [8] of polar wiretap coding, the information-poor set $\mathcal{N}(\tilde{W})$ was defined as
$\{ i : I(\tilde{W}_N^{(i)}) \le 2^{-N^\beta} \}$. In contrast, our criterion here is based on the Bhattacharyya parameter.$^3$ This slight
modification will bring us much convenience when lattice shaping is involved in Section IV. The following lemma
shows that the modified criterion is similar to the original one in the sense that the mutual information of the
subchannels with indices in $\mathcal{N}(\tilde{W})$ can still be bounded in the same form.

Lemma 2: Let $\tilde{W}_N^{(i)}$ be the $i$-th subchannel after the polarization transform on $N$ independent uses of a BMS
channel $\tilde{W}$. If $Z(\tilde{W}_N^{(i)}) \ge 1 - 2^{-N^\beta}$, the mutual information of the $i$-th subchannel can be upper-bounded as

\[ I(\tilde{W}_N^{(i)}) \le 2^{-N^{\beta'}}, \qquad 0 < \beta' < \beta < 0.5, \]

for sufficiently large $N$.

Proof: When $\tilde{W}$ is symmetric, $\tilde{W}_N^{(i)}$ is symmetric as well. By [7, Proposition 1], we have

\[ I(\tilde{W}_N^{(i)}) \le \sqrt{1 - Z(\tilde{W}_N^{(i)})^2} \le \sqrt{2 \cdot 2^{-N^\beta}} \le 2^{-N^{\beta'}}, \]

where the last inequality holds for sufficiently large $N$. □
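The first inequality in the proof, $I \le \sqrt{1 - Z^2}$, can be checked numerically on a simple family of BMS channels. The sketch below assumes a binary symmetric channel with crossover probability $p$, for which $I = 1 - h_2(p)$ and $Z = 2\sqrt{p(1-p)}$ in closed form; the choice of channel and the function names are illustrative.

```python
import math


def bsc_mutual_information(p: float) -> float:
    """I(W) = 1 - h2(p) in bits for a BSC with crossover probability p and uniform input."""
    if p in (0.0, 1.0):
        return 1.0
    return 1.0 + p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p)


def bsc_bhattacharyya(p: float) -> float:
    """Z(W) = 2 * sqrt(p * (1 - p)) for a BSC."""
    return 2.0 * math.sqrt(p * (1.0 - p))


def violates_bound(p: float, tol: float = 1e-12) -> bool:
    """True if I(W) exceeds sqrt(1 - Z(W)^2) by more than tol."""
    i, z = bsc_mutual_information(p), bsc_bhattacharyya(p)
    return i > math.sqrt(max(0.0, 1.0 - z * z)) + tol


if __name__ == "__main__":
    print(any(violates_bound(k / 1000.0) for k in range(1001)))
```

For this channel the bound reduces to $1 - h_2(p) \le |1 - 2p|$, so a channel whose Bhattacharyya parameter is close to 1 necessarily carries very little mutual information.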

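Since the mutual information of subchannels in $\mathcal{N}(\tilde{W})$ can be upper-bounded in the same form, it is not difficult
to understand that strong secrecy can be achieved using the index partition proposed in [8]. Similarly, we divide
the index set $[N]$ into the following four sets:

\[ \mathcal{A} = \mathcal{G}(\tilde{V}) \cap \mathcal{N}(\tilde{W}), \qquad \mathcal{B} = \mathcal{G}(\tilde{V}) \cap \mathcal{N}(\tilde{W})^c, \]
\[ \mathcal{C} = \mathcal{G}(\tilde{V})^c \cap \mathcal{N}(\tilde{W}), \qquad \mathcal{D} = \mathcal{G}(\tilde{V})^c \cap \mathcal{N}(\tilde{W})^c. \tag{5} \]

3 This idea has already been used in [8] to prove that the polar wiretap coding scheme is secrecy capacity-achieving.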
Clearly, $\mathcal{A} \cup \mathcal{B} \cup \mathcal{C} \cup \mathcal{D} = [N]$. Then we assign set $\mathcal{A}$ with message bits $M$, set $\mathcal{B}$ with random bits $R$, set $\mathcal{C}$ with
frozen bits $F$ which are known to both Bob and Eve prior to transmission, and set $\mathcal{D}$ with random bits $R$. The next
lemma shows that this assignment achieves strong secrecy. We note that this proof is similar to that in [?] and it
is given in Appendix A.

Lemma 3: According to the partitions of the index set shown in (5), if we assign the four sets as follows

\[ \mathcal{A} \leftarrow M, \qquad \mathcal{B} \leftarrow R, \qquad \mathcal{C} \leftarrow F, \qquad \mathcal{D} \leftarrow R, \tag{6} \]

the information leakage $I(M; Z^{[N]})$ can be upper-bounded as

\[ I(M; Z^{[N]}) \le N \cdot 2^{-N^{\beta'}}, \qquad 0 < \beta' < 0.5. \tag{7} \]
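The four-way index partition can be sketched for a toy pair of degraded channels. The sketch assumes both the main channel and the wiretapper's channel are binary erasure channels (erasure probabilities 0.2 and 0.6, chosen only for illustration), because their Bhattacharyya parameters obey an exact recursion; the 0-based index ordering is again an assumption of the sketch.

```python
def bec_bhattacharyya(erasure_prob: float, m: int) -> list:
    """Bhattacharyya parameters of the 2^m polarized subchannels of a BEC."""
    z = [erasure_prob]
    for _ in range(m):
        z = [v for zi in z for v in (2.0 * zi - zi * zi, zi * zi)]
    return z


def wiretap_partition(eps_main: float, eps_eve: float, m: int, beta: float) -> dict:
    """Split the index set into A (message), B (random), C (frozen), D (random)."""
    n = 2 ** m
    delta = 2.0 ** (-(n ** beta))
    z_v = bec_bhattacharyya(eps_main, m)   # main channel (Bob)
    z_w = bec_bhattacharyya(eps_eve, m)    # wiretapper's channel (Eve)
    good_bob = {i for i in range(n) if z_v[i] <= delta}        # reliability-good for Bob
    poor_eve = {i for i in range(n) if z_w[i] >= 1.0 - delta}  # information-poor for Eve
    everything = set(range(n))
    return {
        "A": good_bob & poor_eve,
        "B": good_bob - poor_eve,
        "C": (everything - good_bob) & poor_eve,
        "D": everything - good_bob - poor_eve,
    }


if __name__ == "__main__":
    parts = wiretap_partition(0.2, 0.6, 10, 0.3)
    print({name: len(idx) for name, idx in parts.items()})
```

The size of the message set relative to the block length gives the secrecy rate at this block length, while the size of the last set shows how many unpolarized indices remain.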


With regard to the secrecy rate, we show that the modified polar coding scheme can also achieve the secrecy 
capacity. 

Lemma 4: Let $C(\tilde{V})$ and $C(\tilde{W})$ denote the channel capacity of the main channel $\tilde{V}$ and wiretap channel $\tilde{W}$
respectively. Since $\tilde{W}$ is degraded with respect to $\tilde{V}$, the secrecy capacity, which is given by $C(\tilde{V}) - C(\tilde{W})$, is
achievable using the modified wiretap coding scheme, i.e.,

\[ \lim_{N \to \infty} |\mathcal{G}(\tilde{V}) \cap \mathcal{N}(\tilde{W})| / N = C(\tilde{V}) - C(\tilde{W}). \]

Proof: See Appendix B. □

We can also observe that the proportion of the problematic set $\mathcal{D}$ is arbitrarily small when $N$ is sufficiently large.
This is because set $\mathcal{D}$ is a subset of the unpolarized set $\{ i : 2^{-N^\beta} < Z(\tilde{V}_N^{(i)}) < 1 - 2^{-N^\beta} \}$. As has been shown in
[8], the reliability condition cannot be fulfilled with SC decoding due to the existence of $\mathcal{D}$. Fortunately, we can
use the blocking technique proposed in [?] to achieve reliability and strong secrecy simultaneously. More details
of this blocking technique will be discussed in Section III-D and Section IV-E.


C. Secrecy-good polar lattices 

A sublattice $\Lambda' \subset \Lambda$ induces a partition (denoted by $\Lambda/\Lambda'$) of $\Lambda$ into equivalence classes modulo $\Lambda'$. The order
of the partition is denoted by $|\Lambda/\Lambda'|$, which is equal to the number of cosets. If $|\Lambda/\Lambda'| = 2$, we call this a binary
partition. Let $\Lambda/\Lambda_1/\cdots/\Lambda_{r-1}/\Lambda'$ for $r \ge 1$ be an $n$-dimensional lattice partition chain. For each partition $\Lambda_{\ell-1}/\Lambda_\ell$
($1 \le \ell \le r$ with convention $\Lambda_0 = \Lambda$ and $\Lambda_r = \Lambda'$) a code $C_\ell$ over $\Lambda_{\ell-1}/\Lambda_\ell$ selects a sequence of representatives
$a_\ell$ for the cosets of $\Lambda_\ell$. Consequently, if each partition is binary, the code $C_\ell$ is a binary code.

Polar lattices are constructed by "Construction D" [28, p.232] using a set of nested polar codes $C_1 \subseteq C_2 \subseteq \cdots \subseteq C_r$
[?]. Suppose $C_\ell$ has block length $N$ and the number of information bits $k_\ell$ for $1 \le \ell \le r$. Choose a basis
$g_1, g_2, \ldots, g_N$ from the polar generator matrix $G_N$ such that $g_1, \ldots, g_{k_\ell}$ span $C_\ell$. When the dimension $n = 1$, the
lattice $L$ admits the form [27]

\[ L = \left\{ \sum_{\ell=1}^{r} 2^{\ell-1} \sum_{i=1}^{k_\ell} u_\ell^i g_i + 2^r \mathbb{Z}^N \;\middle|\; u_\ell^i \in \{0, 1\} \right\}, \tag{8} \]
where the addition is carried out in $\mathbb{R}^N$. The fundamental volume of a lattice obtained from this construction is
given by

\[ \mathrm{Vol}(L) = 2^{-N R_C} \cdot \mathrm{Vol}(\Lambda_r)^N, \]

where $R_C = \sum_{\ell=1}^{r} R_\ell = \frac{1}{N} \sum_{\ell=1}^{r} k_\ell$ denotes the sum rate of component codes. In this paper, we limit ourselves
to the binary lattice partition chain and binary polar codes for simplicity.
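The volume formula can be checked on a toy instance of (8) by counting the distinct lattice points modulo $2^r \mathbb{Z}^N$: with $\Lambda_r = 2^r \mathbb{Z}$, there should be $2^{\sum_\ell k_\ell}$ of them. In the sketch below, the rows of $G_N$ are ordered by decreasing Hamming weight, which is only a stand-in for the channel-dependent polar ordering; the block length, the levels and the code dimensions are likewise illustrative assumptions.

```python
import itertools

import numpy as np


def polar_generator(m: int) -> np.ndarray:
    """G_N = [[1, 0], [1, 1]] raised to the m-th Kronecker power, as a 0/1 integer matrix."""
    f = np.array([[1, 0], [1, 1]], dtype=np.int64)
    g = np.array([[1]], dtype=np.int64)
    for _ in range(m):
        g = np.kron(g, f)
    return g


def weight_ordered_basis(m: int) -> np.ndarray:
    """Rows of G_N sorted by decreasing Hamming weight (an illustrative ordering only)."""
    g = polar_generator(m)
    order = np.argsort(-g.sum(axis=1), kind="stable")
    return g[order]


def construction_d_points(basis: np.ndarray, ks: list) -> set:
    """Distinct points of L modulo 2^r Z^N, built as in (8) with integer addition.

    basis: rows g_1, ..., g_N ordered so that the first k_l rows span C_l.
    ks:    [k_1, ..., k_r] with k_1 <= ... <= k_r (nested codes).
    """
    modulus = 2 ** len(ks)
    n = basis.shape[1]
    points = set()
    for bits in itertools.product((0, 1), repeat=sum(ks)):
        x = np.zeros(n, dtype=np.int64)
        pos = 0
        for level, k in enumerate(ks):  # level = l - 1
            u = np.array(bits[pos:pos + k], dtype=np.int64)
            x += (2 ** level) * (u @ basis[:k])
            pos += k
        points.add(tuple(int(v) for v in x % modulus))
    return points


if __name__ == "__main__":
    ks = [1, 3]                      # N = 4, r = 2
    pts = construction_d_points(weight_ordered_basis(2), ks)
    print(len(pts), (2 ** len(ks)) ** 4 / len(pts))  # number of cosets and Vol(L)
```

In this toy case the first level uses the repetition code and the second the even-weight code, so the volume comes out as $2^{-4} \cdot 4^4 = 16$.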



Fig. 2. The mod-$\Lambda_s$ Gaussian wiretap channel.


Now we consider the construction of secrecy-good polar lattices over the mod-$\Lambda_s$ GWC shown in Fig. 2. The
difference between the mod-$\Lambda_s$ GWC and the genuine GWC is the mod-$\Lambda_s$ operation on the received signal of
Bob and Eve. With some abuse of notation, the outputs $Y^{[N]}$ and $Z^{[N]}$ at Bob and Eve's ends respectively become

\[ Y^{[N]} = [X^{[N]} + W_b^{[N]}] \bmod \Lambda_s, \qquad Z^{[N]} = [X^{[N]} + W_e^{[N]}] \bmod \Lambda_s. \]

The idea of wiretap lattice coding over the mod-$\Lambda_s$ GWC [6] can be explained as follows. Let $\Lambda_b$ and $\Lambda_e$ be
the AWGN-good lattice and secrecy-good lattice designed for Bob and Eve accordingly. Let $\Lambda_s \subset \Lambda_e \subset \Lambda_b$ be a
nested chain of $N$-dimensional lattices in $\mathbb{R}^N$, where $\Lambda_s$ is the shaping lattice. Note that the shaping lattice $\Lambda_s$ here
is employed primarily for the convenience of designing the secrecy-good lattice and secondarily for satisfying the
power constraint. Consider a one-to-one mapping $\mathcal{M} \to \Lambda_b/\Lambda_e$ which associates each message $m \in \mathcal{M}$ to a coset
$\tilde{\lambda}_m \in \Lambda_b/\Lambda_e$. Alice selects a lattice point $\lambda \in \Lambda_e \cap \mathcal{V}(\Lambda_s)$ uniformly at random and transmits $X^{[N]} = \lambda + \lambda_m$,
where $\lambda_m$ is the coset representative of $\tilde{\lambda}_m$ in $\mathcal{V}(\Lambda_e)$. This scheme has been proved to achieve both reliability and
semantic security in [6] by random lattice codes. We will make it explicit by constructing polar lattice codes in
this section.

Let $\Lambda_b$ and $\Lambda_e$ be constructed from a binary partition chain $\Lambda/\Lambda_1/\cdots/\Lambda_{r-1}/\Lambda_r$, and assume $\Lambda_s \subset \Lambda_r^N$ such
that $\Lambda_s \subset \Lambda_r^N \subset \Lambda_e \subset \Lambda_b$.$^4$ Also, denote by $X_{1:r}^{[N]}$ the bits encoding $\Lambda^N/\Lambda_r^N$, which include all information bits
for message $M$ as a subset. We have that $[X^{[N]} + W_e^{[N]}] \bmod \Lambda_r^N$ is a sufficient statistic for $X_{1:r}^{[N]}$. This can be seen
from [27, Lemma 8], rewritten as follows:

4 This is always possible with sufficient power, since the power constraint is not our primary concern in this section.

Lemma 5 (Sufficiency of mod-$\Lambda$ output [27]): For a partition chain $\Lambda/\Lambda'$ ($\Lambda' \subset \Lambda$), let the input of an AWGN
channel be $X = A + B$, where $A \in \mathcal{R}(\Lambda)$ is a random variable, and $B$ is uniformly distributed in $\Lambda \cap \mathcal{R}(\Lambda')$. Reduce
the output $Y$ first to $Y' = Y \bmod \Lambda'$ and then to $Y'' = Y' \bmod \Lambda$. Then the mod-$\Lambda$ map is information-lossless,
namely $I(A; Y') = I(A; Y'')$, which means that the output $Y'' = Y' \bmod \Lambda$ of the mod-$\Lambda$ map is a sufficient statistic
for $A$.

In our context, we identify $\Lambda$ with $\Lambda_r^N$ and $\Lambda'$ with $\Lambda_s$, respectively. Since the bits encoding $\Lambda_r^N/\Lambda_s$ are uniformly
distributed,$^5$ the mod-$\Lambda_r^N$ operation is information-lossless in the sense that

\[ I(X_{1:r}^{[N]}; Z^{[N]}) = I(X_{1:r}^{[N]}; [X^{[N]} + W_e^{[N]}] \bmod \Lambda_r^N). \]

As far as the mutual information $I(X_{1:r}^{[N]}; Z^{[N]})$ is concerned, we can use the mod-$\Lambda_r^N$ operator instead of the mod-$\Lambda_s$
operator here. Under this condition, similarly to the multilevel lattice structure introduced in [27], the mod-$\Lambda_s$
channel can be decomposed into a series of BMS channels according to the partition chain $\Lambda/\Lambda_1/\cdots/\Lambda_{r-1}/\Lambda_r$.
Therefore, the already mentioned polar coding technique for BMS channels can be employed. Moreover, the channel
resulting from the lattice partition chain can be proved to be equivalent to that based on the chain rule of mutual
information. Following this channel equivalence, we can construct an AWGN-good lattice $\Lambda_b$ and a secrecy-good
lattice $\Lambda_e$, using the wiretap coding technique (6) at each partition level.

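A mod-$\Lambda$ channel is a Gaussian channel with a modulo-$\Lambda$ operator in the front end [27], [34]. The capacity of
the mod-$\Lambda$ channel is [27]

\[ C(\Lambda, \sigma^2) = \log(\mathrm{Vol}(\Lambda)) - h(\Lambda, \sigma^2), \tag{9} \]

where $h(\Lambda, \sigma^2)$ is the differential entropy of the $\Lambda$-aliased noise over $\mathcal{V}(\Lambda)$:

\[ h(\Lambda, \sigma^2) = -\int_{\mathcal{V}(\Lambda)} f_{\sigma,\Lambda}(x) \log f_{\sigma,\Lambda}(x) \, dx. \]

The differential entropy is maximized to $\log(\mathrm{Vol}(\Lambda))$ by the uniform distribution over $\mathcal{V}(\Lambda)$. The $\Lambda_{\ell-1}/\Lambda_\ell$ channel
is defined as a mod-$\Lambda_\ell$ channel whose input is drawn from $\Lambda_{\ell-1} \cap \mathcal{V}(\Lambda_\ell)$. It is known that the $\Lambda_{\ell-1}/\Lambda_\ell$ channel
is symmetric,$^6$ and the optimum input distribution is uniform [27]. Furthermore, the $\Lambda_{\ell-1}/\Lambda_\ell$ channel is binary if
$|\Lambda_{\ell-1}/\Lambda_\ell| = 2$. The capacity of the $\Lambda_{\ell-1}/\Lambda_\ell$ channel for Gaussian noise of variance $\sigma^2$ is given by [27]

\[ C(\Lambda_{\ell-1}/\Lambda_\ell, \sigma^2) = C(\Lambda_\ell, \sigma^2) - C(\Lambda_{\ell-1}, \sigma^2) \]
\[ = h(\Lambda_{\ell-1}, \sigma^2) - h(\Lambda_\ell, \sigma^2) + \log(\mathrm{Vol}(\Lambda_\ell)/\mathrm{Vol}(\Lambda_{\ell-1})). \]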
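These capacities are easy to evaluate numerically in one dimension. The sketch below assumes the partition chain $\mathbb{Z}/2\mathbb{Z}/4\mathbb{Z}/\cdots$, truncates the aliased Gaussian sum, and integrates over the Voronoi interval with a midpoint rule; the function names, the truncation length and the grid size are illustrative choices.

```python
import numpy as np


def aliased_gaussian(x: np.ndarray, a: float, sigma: float, terms: int = 50) -> np.ndarray:
    """f_{sigma,Lambda}(x) for Lambda = a*Z: Gaussian density summed over shifts by a*k."""
    k = np.arange(-terms, terms + 1)
    diffs = x[:, None] - a * k[None, :]
    return np.exp(-diffs**2 / (2.0 * sigma**2)).sum(axis=1) / (np.sqrt(2.0 * np.pi) * sigma)


def mod_lattice_capacity(a: float, sigma: float, grid: int = 4000) -> float:
    """C(Lambda, sigma^2) = log2 Vol(Lambda) - h(Lambda, sigma^2) in bits, for Lambda = a*Z."""
    step = a / grid
    x = -a / 2.0 + (np.arange(grid) + 0.5) * step  # midpoints covering the Voronoi interval
    f = aliased_gaussian(x, a, sigma)
    h = -np.sum(f * np.log2(f)) * step             # differential entropy of the aliased noise
    return float(np.log2(a) - h)


def partition_channel_capacity(level: int, sigma: float) -> float:
    """Capacity of the 2^(level-1) Z / 2^level Z channel as a difference of mod-lattice capacities."""
    return mod_lattice_capacity(2.0**level, sigma) - mod_lattice_capacity(2.0 ** (level - 1), sigma)


if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, partition_channel_capacity(level, 1.0))
```

With $\sigma = 1$ the lowest level is almost useless and the higher levels become progressively more reliable, which is why only a few levels of the chain need nontrivial component codes.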

The decomposition into a set of $\Lambda_{\ell-1}/\Lambda_\ell$ channels is used in [27] to construct AWGN-good lattices. Take the
partition chain $\mathbb{Z}/2\mathbb{Z}/\cdots/2^r\mathbb{Z}$ as an example. Given uniform input $X_{1:r}$, let $\mathcal{K}_\ell$ denote the coset indexed by $x_{1:\ell}$,
i.e., $\mathcal{K}_\ell = x_1 + \cdots + 2^{\ell-1} x_\ell + 2^\ell \mathbb{Z}$. Given that $X_{1:\ell-1} = x_{1:\ell-1}$, the conditional probability distribution function
(PDF) of this channel with binary input $X_\ell$ and output $\bar{Z} = Z \bmod \Lambda_\ell$ is

\[ f_{\bar{Z}|X_\ell}(\bar{z}|x_\ell) = \frac{1}{\sqrt{2\pi}\sigma_e} \sum_{a \in \mathcal{K}_\ell(x_{1:\ell})} \exp\left( -\frac{|\bar{z} - a|^2}{2\sigma_e^2} \right). \tag{10} \]

5 In fact, all bits encoding $\Lambda_e/\Lambda_s$ are uniformly distributed in wiretap coding.

6 This is "regular" in the sense of Delsarte and Piret and symmetric in the sense of Gallager [?].


Since the previous input bits $x_{1:\ell-1}$ cause a shift on $\mathcal{K}_\ell$ and will be removed by the multistage decoder at level $\ell$, the
code can be designed according to the channel transition probability (10) with $x_{1:\ell-1} = 0$. Following the notation of
[?], we use $\tilde{V}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ and $\tilde{W}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ to denote the $\Lambda_{\ell-1}/\Lambda_\ell$ channel for Bob and Eve respectively.
The $\Lambda_{\ell-1}/\Lambda_\ell$ channel can also be used to construct secrecy-good lattices. In order to bound the information leakage
of the wiretapper's channel, we firstly express $I(X_{1:r}; Z)$ according to the chain rule of mutual information as

\[ I(X_{1:r}; Z) = I(X_1; Z) + I(X_2; Z|X_1) + \cdots + I(X_r; Z|X_{1:r-1}). \tag{11} \]


This equation still holds if $Z$ denotes the noisy signal after the mod-$\Lambda_r$ operation, namely, $Z = [X + W_e] \bmod \Lambda_r$. We
will adopt this notation in the rest of this subsection. We refer to the $\ell$-th channel associated with mutual information
$I(X_\ell; Z|X_{1:\ell-1})$ as the equivalent channel denoted by $\tilde{W}'(X_\ell; Z|X_{1:\ell-1})$, which is defined as the channel from $X_\ell$
to $Z$ given the previous $X_{1:\ell-1}$. Then the transition probability distribution of $\tilde{W}'(X_\ell; Z|X_{1:\ell-1})$ is [27, Lemma 6]

\[ f_{Z|X_\ell}(z|x_\ell) = \frac{1}{\Pr(\mathcal{K}_\ell(x_{1:\ell}))} \sum_{a \in \mathcal{K}_\ell(x_{1:\ell})} \Pr(a) f_Z(z|a) \]
\[ = \frac{1}{|\Lambda_\ell/\Lambda_r|} \frac{1}{\sqrt{2\pi}\sigma_e} \sum_{a \in \mathcal{K}_\ell(x_{1:\ell})} \exp\left( -\frac{1}{2\sigma_e^2} \|z - a\|^2 \right), \qquad z \in \mathcal{V}(\Lambda_r). \tag{12} \]


From (10) and (12), we can observe that the channel output likelihood ratio (LR) of the $\tilde{W}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ channel
is equal to that of the $\ell$-th equivalent channel $\tilde{W}'(X_\ell; Z|X_{1:\ell-1})$. Then we have the following channel equivalence
lemma.

Lemma 6: Consider a lattice $L$ constructed by a binary lattice partition chain $\Lambda/\Lambda_1/\cdots/\Lambda_{r-1}/\Lambda_r$. Constructing a
polar code for the $\ell$-th equivalent binary-input channel $\tilde{W}'(X_\ell; Z|X_{1:\ell-1})$ defined by the chain rule (11) is equivalent
to constructing a polar code for the $\Lambda_{\ell-1}/\Lambda_\ell$ channel $\tilde{W}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$.

Proof: See Appendix C. □

Note that another proof based on direct calculation of the mutual information and Bhattacharyya parameters of
the subchannels can be found in [35].

Remark 1: Observe that if we define $\tilde{V}'(X_\ell; Y|X_{1:\ell-1})$ as the equivalent channel according to the chain rule
expansion of $I(X; Y)$ for the main channel, the same result can be obtained between $\tilde{V}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ and $\tilde{V}'(X_\ell; Y|X_{1:\ell-1})$.
Moreover, this lemma also holds without the mod-$\Lambda_s$ front-end, i.e., without power constraint. The construction of
AWGN-good polar lattices was given in [15], where nested polar codes were constructed based on a set of $\Lambda_{\ell-1}/\Lambda_\ell$
channels. We note that the $\Lambda_{\ell-1}/\Lambda_\ell$ channel is degraded with respect to the $\Lambda_\ell/\Lambda_{\ell+1}$ channel [15, Lemma 3].

Fig. 3. The multilevel lattice coding system over the mod-$\Lambda_s$ Gaussian wiretap channel.

We are now ready to introduce the polar lattice construction for the mod-$\Lambda_s$ GWC shown in Fig. 3. A polar lattice
$L$ is constructed by a series of nested polar codes $C_1(N,k_1) \subseteq C_2(N,k_2) \subseteq \cdots \subseteq C_r(N,k_r)$ and a binary
lattice partition chain $\Lambda/\Lambda_1/\cdots/\Lambda_r$. The block length of polar codes is $N$. Alice splits the message $\mathsf{M}$ into
$\mathsf{M}_1, \ldots, \mathsf{M}_r$. We follow the same rule © to assign bits in the component polar codes to achieve strong secrecy.
Note that $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ is degraded with respect to $V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ for $1 \le \ell \le r$ because $\sigma_b^2 \le \sigma_e^2$. Treating
$V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ and $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ as the main channel and wiretapper's channel at each level and using the
partition rule ©, we can get four sets $\mathcal{A}_\ell$, $\mathcal{B}_\ell$, $\mathcal{C}_\ell$ and $\mathcal{D}_\ell$. Similarly, we assign the bits as follows


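$$
\mathcal{A}_\ell \leftarrow \mathsf{M}_\ell, \quad \mathcal{B}_\ell \leftarrow \mathsf{R}_\ell, \quad
\mathcal{C}_\ell \leftarrow \mathsf{F}_\ell, \quad \mathcal{D}_\ell \leftarrow \mathsf{R}_\ell \tag{13}
$$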


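for each level $\ell$, where $\mathsf{M}_\ell$, $\mathsf{F}_\ell$ and $\mathsf{R}_\ell$ represent message bits, frozen bits (could be set as all zeros) and random
bits at level $\ell$. Since the $\Lambda_{\ell-1}/\Lambda_\ell$ channel is degraded with respect to the $\Lambda_\ell/\Lambda_{\ell+1}$ channel, it is easy to obtain that
$\mathcal{C}_\ell \supseteq \mathcal{C}_{\ell+1}$, which means $\mathcal{A}_\ell \cup \mathcal{B}_\ell \cup \mathcal{D}_\ell \subseteq \mathcal{A}_{\ell+1} \cup \mathcal{B}_{\ell+1} \cup \mathcal{D}_{\ell+1}$. This construction is clearly a lattice construction
as polar codes constructed on each level are nested. We skip the proof of nested polar codes here. A similar proof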
can be found in nn. 
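
The four-set partition and the degradation ordering can be illustrated numerically. The sketch below is only a stand-in: it uses binary erasure channels, for which the bit-channel Bhattacharyya parameters have a closed-form recursion, instead of the $\Lambda_{\ell-1}/\Lambda_\ell$ channels of the text. The threshold tests mirror the reliability-good and information-poor criteria; the function names, the erasure probabilities and the value of $\beta$ are assumptions made for the example.

```python
def bec_bhattacharyya(eps, n):
    """Bhattacharyya parameters of the N = 2**n bit-channels of a BEC(eps)."""
    z = [eps]
    for _ in range(n):
        # each bit-channel splits into a worse (2z - z^2) and a better (z^2) one
        z = [v for zi in z for v in (2 * zi - zi * zi, zi * zi)]
    return z


def partition(z_bob, z_eve, delta):
    """Split the indices into A, B, C, D using two threshold tests."""
    idx = set(range(len(z_bob)))
    good = {i for i in idx if z_bob[i] <= delta}       # reliable for Bob
    poor = {i for i in idx if z_eve[i] >= 1 - delta}   # information-poor for Eve
    A = good & poor           # message bits
    B = good - poor           # random bits
    C = poor - good           # frozen bits
    D = idx - good - poor     # problematic set, fed with random bits
    return A, B, C, D


n = 10                                  # block length N = 1024 (assumed)
delta = 2.0 ** (-(2 ** n) ** 0.3)       # 2^{-N^beta} with beta = 0.3 (assumed)
z_bob = bec_bhattacharyya(0.2, n)       # stand-in main channel
z_eve = bec_bhattacharyya(0.6, n)       # stand-in (degraded) wiretapper's channel
A, B, C, D = partition(z_bob, z_eve, delta)
```

Because the wiretapper's channel is degraded, every bit-channel is at least as bad for Eve as for Bob, which is what keeps the sets consistent across the two threshold tests.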

As a result, the above multilevel construction yields an AWGN-good lattice $\Lambda_b$ and a secrecy-good lattice $\Lambda_e$
simultaneously$^7$. More precisely, $\Lambda_b$ is constructed from a set of nested polar codes $C_1(N, |\mathcal{A}_1| + |\mathcal{B}_1| + |\mathcal{D}_1|) \subseteq
\cdots \subseteq C_r(N, |\mathcal{A}_r| + |\mathcal{B}_r| + |\mathcal{D}_r|)$, while $\Lambda_e$ is constructed from a set of nested polar codes $C_1(N, |\mathcal{B}_1| + |\mathcal{D}_1|) \subseteq
\cdots \subseteq C_r(N, |\mathcal{B}_r| + |\mathcal{D}_r|)$ with the same lattice partition chain. Note that the random bits in the set $\mathcal{D}_\ell$ should be
shared with Bob to guarantee the AWGN-goodness of $\Lambda_b$. More details are given in the next subsection. It is clear
that $\Lambda_e \subset \Lambda_b$. Thus, our proposed coding scheme instantiates the coset coding scheme introduced in [6], where the
confidential message is mapped to the coset $\tilde{\lambda}_m \in \Lambda_b/\Lambda_e$.


7 In this paper, a sequence of lattices $\Lambda_e$ of increasing dimension is called secrecy-good if they achieve the strong secrecy capacity
asymptotically. Note that this definition is different from that in [6], which is based on the flatness factor.

By using the above assignments and Lemma 0 we have

$$
I(\mathsf{M}_\ell; \mathsf{Z}_\ell^{[N]}) \le N 2^{-N^{\beta'}}, \tag{14}
$$

where $\mathsf{Z}_\ell^{[N]} = \mathsf{Z}^{[N]} \bmod \Lambda_\ell$. In other words, the employed polar code for the channel $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ can guarantee
that the mutual information between the input message and the output is upper bounded by $N 2^{-N^{\beta'}}$. According to
Lemma 6, this polar code can also guarantee the same upper bound on the mutual information between the input
message and the output of the channel $W'(\mathsf{X}_\ell; \mathsf{Z} | \mathsf{X}_{1:\ell-1})$, as shown in the following inequality ($\mathsf{X}_\ell$ is independent
of the previous $\mathsf{X}_{1:\ell-1}$):

$$
I(\mathsf{M}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) \le N 2^{-N^{\beta'}}.
$$


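Recall that $\mathsf{Z}^{[N]}$ is the signal received by Eve after the mod-$\Lambda_r$ operation. From the chain rule of mutual information,

$$
\begin{aligned}
I(\mathsf{M}; \mathsf{Z}^{[N]})
&= \sum_{\ell=1}^{r} I(\mathsf{Z}^{[N]}; \mathsf{M}_\ell | \mathsf{M}_{1:\ell-1}) \\
&= \sum_{\ell=1}^{r} H(\mathsf{M}_\ell | \mathsf{M}_{1:\ell-1}) - H(\mathsf{M}_\ell | \mathsf{Z}^{[N]}, \mathsf{M}_{1:\ell-1}) \\
&\le \sum_{\ell=1}^{r} H(\mathsf{M}_\ell) - H(\mathsf{M}_\ell | \mathsf{Z}^{[N]}, \mathsf{M}_{1:\ell-1}) \\
&= \sum_{\ell=1}^{r} I(\mathsf{M}_\ell; \mathsf{Z}^{[N]}, \mathsf{M}_{1:\ell-1}) \\
&\le \sum_{\ell=1}^{r} I(\mathsf{M}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) \le r N 2^{-N^{\beta'}},
\end{aligned} \tag{15}
$$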


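where the first inequality in the last line holds because $I(\mathsf{M}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) = I(\mathsf{M}_\ell; \mathsf{Z}^{[N]}, \mathsf{U}_{1:\ell-1}^{[N]})$ and adding more variables will
not decrease the mutual information. Therefore strong secrecy is achieved since $\lim_{N\to\infty} I(\mathsf{M}; \mathsf{Z}^{[N]}) = 0$.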
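
To see how quickly the bound $rN2^{-N^{\beta'}}$ in (15) vanishes, it can be evaluated in the log domain. The snippet below is a small sketch; the helper name and the choice $r = \log_2 N$ are assumptions for illustration (the text only requires $r = O(\log N)$).

```python
import math


def log2_leakage_bound(n_log2, r, beta_prime):
    """Return log2 of r * N * 2**(-N**beta_prime) for N = 2**n_log2."""
    N = 2.0 ** n_log2
    return math.log2(r) + n_log2 - N ** beta_prime


# Block lengths N = 2^10, 2^14, 2^18 with r = log2(N) levels and beta' = 0.4.
bounds = [log2_leakage_bound(n, n, 0.4) for n in (10, 14, 18)]
```

The polynomial factor $rN$ is eventually dominated by $2^{-N^{\beta'}}$, so the entries of `bounds` decrease as the block length grows.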


Remark 2: Note that the above analysis actually implies semantic security, i.e., (15) holds for arbitrarily distributed
$\mathsf{M}$. This is because of the symmetric nature of the $\Lambda_b/\Lambda_e$ channel [27]. Since the message $\mathsf{M}$ is drawn from $\mathcal{R}(\Lambda_e)$
and the random bits are drawn from $\Lambda_e \cap \mathcal{R}(\Lambda_s)$, by Lemma 0 the mod-$\Lambda_e$ map is information lossless and its output
is a sufficient statistic for $\mathsf{M}$. In this sense, the channel between the confidential message and the eavesdropper's
signal can be viewed as a $\Lambda_b/\Lambda_e$ channel. Since the $\Lambda_b/\Lambda_e$ channel is symmetric, the maximum mutual information
is achieved by the uniform input. Consequently, the mutual information corresponding to other input distributions
can also be upper bounded by $rN2^{-N^{\beta'}}$ as in (15). It is worth mentioning that this $\Lambda_b/\Lambda_e$ channel can be seen as the
counterpart in lattice coding of the randomness-induced channel defined in [8].


Theorem 2 (Achieving secrecy capacity of the mod-$\Lambda_s$ GWC): Consider a polar lattice $L$ constructed according
to (13) with the binary lattice partition chain $\Lambda/\Lambda_1/\cdots/\Lambda_r$ and $r$ binary nested polar codes with block length $N$.
Scale $\Lambda$ and $r$ to satisfy the following conditions:

(i) $h(\Lambda, \sigma_b^2) \to \log(\mathrm{Vol}(\Lambda))$,
(ii) $h(\Lambda_r, \sigma_e^2) \to \frac{1}{2}\log(2\pi e \sigma_e^2)$.

Given $\sigma_e^2 > \sigma_b^2$, all strong secrecy rates $R$ satisfying

$$
R < \frac{1}{2} \log \frac{\sigma_e^2}{\sigma_b^2}
$$

are achievable as $N \to \infty$, using the polar lattice $L$ on the mod-$\Lambda_s$ Gaussian wiretap channel.
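Proof: By Lemma 4 and (13),

$$
\begin{aligned}
\lim_{N\to\infty} R
&= \sum_{\ell=1}^{r} \lim_{N\to\infty} \frac{|\mathcal{A}_\ell|}{N} \\
&= \sum_{\ell=1}^{r} C(V_\ell) - C(W_\ell) \\
&= \sum_{\ell=1}^{r} C(V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)) - C(W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)) \\
&= C(V(\Lambda/\Lambda_r, \sigma_b^2)) - C(W(\Lambda/\Lambda_r, \sigma_e^2)) \\
&= C(\Lambda_r, \sigma_b^2) - C(\Lambda, \sigma_b^2) - C(\Lambda_r, \sigma_e^2) + C(\Lambda, \sigma_e^2) \\
&= h(\Lambda_r, \sigma_e^2) - h(\Lambda_r, \sigma_b^2) + h(\Lambda, \sigma_b^2) - h(\Lambda, \sigma_e^2) \\
&= \frac{1}{2} \log \frac{\sigma_e^2}{\sigma_b^2} - (\epsilon_e - \epsilon_b) - \epsilon_1,
\end{aligned} \tag{16}
$$

where

$$
\begin{cases}
\epsilon_1 = h(\Lambda, \sigma_e^2) - h(\Lambda, \sigma_b^2) \ge 0, \\
\epsilon_b = h(\sigma_b^2) - h(\Lambda_r, \sigma_b^2) = \frac{1}{2}\log(2\pi e \sigma_b^2) - h(\Lambda_r, \sigma_b^2) \ge 0, \\
\epsilon_e = h(\sigma_e^2) - h(\Lambda_r, \sigma_e^2) = \frac{1}{2}\log(2\pi e \sigma_e^2) - h(\Lambda_r, \sigma_e^2) \ge 0,
\end{cases}
$$

and $\epsilon_e - \epsilon_b \ge 0$.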

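By scaling $\Lambda$ so that the mod-$\Lambda$ noise is almost uniform, we can have $h(\Lambda, \sigma_b^2) \to \log(V(\Lambda))$. Since $\sigma_e^2 > \sigma_b^2$,
we also have $h(\Lambda, \sigma_e^2) \to \log(V(\Lambda))$ and thus $\epsilon_1 \approx 0$. The number of levels is also increased until $h(\Lambda_r, \sigma_e^2) \approx
\frac{1}{2}\log(2\pi e \sigma_e^2)$, hence $h(\Lambda_r, \sigma_b^2) \approx \frac{1}{2}\log(2\pi e \sigma_b^2)$, such that both $\epsilon_b$ and $\epsilon_e$ are almost 0. Therefore by scaling $\Lambda$
and adjusting $r$, the secrecy rate can get arbitrarily close to $\frac{1}{2}\log\frac{\sigma_e^2}{\sigma_b^2}$. □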

Remark 3: The secrecy capacity of the mod-$\Lambda_s$ Gaussian wiretap channel per use is given by

$$
C_s = \frac{1}{N} C(\Lambda_s, \sigma_b^2) - \frac{1}{N} C(\Lambda_s, \sigma_e^2) = \frac{1}{N} h(\Lambda_s, \sigma_e^2) - \frac{1}{N} h(\Lambda_s, \sigma_b^2)
$$

since the wiretapper's channel is degraded with respect to the main channel. Because $h(\Lambda_r, \sigma_e^2) \to \frac{1}{2}\log(2\pi e \sigma_e^2)$
and $\Lambda_s \subset \Lambda_r^N$, we have $\frac{1}{N} h(\Lambda_s, \sigma_e^2) \to \frac{1}{2}\log(2\pi e \sigma_e^2)$ and $\frac{1}{N} h(\Lambda_s, \sigma_b^2) \to \frac{1}{2}\log(2\pi e \sigma_b^2)$. Hence $C_s \to \frac{1}{2}\log\frac{\sigma_e^2}{\sigma_b^2}$.
It also equals the secrecy capacity of the Gaussian wiretap channel when the signal power goes to infinity. It is
noteworthy that we successfully remove the $\frac{1}{2}$-nat gap in the achievable secrecy rate derived in [6], which is caused
by the limitation of the $L^\infty$ distance associated with the flatness factor.


Remark 4: The mild conditions (i) and (ii) stated in the theorem are easy to meet, by scaling the top lattice $\Lambda$ and
choosing the number of levels $r$ appropriately. Consider an example for $\sigma_e^2 = 4$ and $\sigma_b^2 = 1$. We choose $r = 3$
levels and a partition chain $\mathbb{Z}/2\mathbb{Z}/4\mathbb{Z}$ with scaling factor 2.5. The difference between the achievable rate computed
from (16) and the upper bound $\frac{1}{2}\log\frac{\sigma_e^2}{\sigma_b^2}$ on secrecy capacity is only 0.05.
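
The entropy form of (16) can be evaluated numerically for a one-dimensional partition chain. The following sketch is an illustration under stated assumptions, not the authors' computation: it takes the top lattice to be $\eta\mathbb{Z}$ and the bottom lattice to be $\eta 2^{d}\mathbb{Z}$, measures entropy in bits, and integrates the folded Gaussian density with a midpoint rule. The function names, the parameters `eta` and `depth`, and the reading of "scaling factor" as the top-lattice scale are all assumptions.

```python
import math


def mod_lattice_entropy(a, sigma2, grid=4000, tails=8):
    """Differential entropy (bits) of N(0, sigma2) noise reduced modulo a*Z."""
    sigma = math.sqrt(sigma2)
    # lattice shifts needed so the folded sum covers about +-tails*sigma
    kmax = int(math.ceil(tails * sigma / a)) + 1
    norm = math.sqrt(2 * math.pi * sigma2)
    step = a / grid
    h = 0.0
    for n in range(grid):
        x = -a / 2 + (n + 0.5) * step      # midpoint rule on the Voronoi cell
        f = sum(math.exp(-(x + k * a) ** 2 / (2 * sigma2))
                for k in range(-kmax, kmax + 1)) / norm
        if f > 0.0:
            h -= f * math.log2(f) * step
    return h


def achievable_rate(eta, depth, sigma_b2, sigma_e2):
    """Entropy expression of (16) for top lattice eta*Z, bottom eta*2^depth*Z."""
    top, bottom = eta, eta * 2 ** depth
    return (mod_lattice_entropy(bottom, sigma_e2)
            - mod_lattice_entropy(bottom, sigma_b2)
            + mod_lattice_entropy(top, sigma_b2)
            - mod_lattice_entropy(top, sigma_e2))
```

Calling `achievable_rate(2.5, 2, 1.0, 4.0)` evaluates the expression for the variances of this remark under the assumptions above; the result can then be compared with the upper bound $\frac{1}{2}\log_2(\sigma_e^2/\sigma_b^2) = 1$ bit.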

Remark 5: From conditions (i) and (ii), we can see that the construction for secrecy-good lattices requires
more levels than the construction of AWGN-good lattices. $\epsilon_1$ can be made arbitrarily small by scaling down $\Lambda$
such that both $h(\Lambda, \sigma_e^2)$ and $h(\Lambda, \sigma_b^2)$ are sufficiently close to $\log V(\Lambda)$. For polar lattices for AWGN-goodness
m. we only need $h(\Lambda_{r'}, \sigma_b^2) \approx \frac{1}{2}\log(2\pi e \sigma_b^2)$ for some $r' < r$. Since $\epsilon_b < \epsilon_e$, $\Lambda_{r'}$ may not be enough for the
wiretapper's channel. Therefore, more levels are needed in the wiretap coding context. To satisfy the condition
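$h(\Lambda_r, \sigma_e^2) \to \frac{1}{2}\log(2\pi e \sigma_e^2)$, it is sufficient to guarantee that $P_e(\Lambda_r, \sigma_e^2) \to 0$ by [27, Theorem 13]. When the one-dimensional
binary partition $\mathbb{Z}/2\mathbb{Z}/4\mathbb{Z}/\cdots$ is used, we have $P_e(\Lambda_r, \sigma_e^2) \le 2Q\left(\frac{2^{r-1}}{\sigma_e}\right) \le e^{-\frac{2^{2r}}{8\sigma_e^2}}$, where $Q(\cdot)$ is the
Q-function. Letting $r = O(\log N)$, the error probability vanishes as $P_e(\Lambda_r, \sigma_e^2) = e^{-O(N)}$, which implies that
$h(\Lambda_r, \sigma_e^2) \to \frac{1}{2}\log(2\pi e \sigma_e^2)$ as $N \to \infty$.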

D. Reliability 

In the original polar coding scheme for the binary wiretap channel [8], how to assign the set $\mathcal{D}$ is a problem.
Assigning frozen bits to $\mathcal{D}$ guarantees reliability but only achieves weak secrecy, whereas assigning random bits
to $\mathcal{D}$ guarantees strong secrecy but may violate the reliability requirement because $\mathcal{D}$ may be nonempty. In order
to ensure strong secrecy, $\mathcal{D}$ is assigned random bits ($\mathcal{D} \leftarrow \mathsf{R}$), so this scheme fails to achieve reliability
in theory. For the $\ell$-th level channel $V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ at Bob's end, the probability of error is upper
bounded by the sum of the Bhattacharyya parameters $\tilde{Z}(V_N^{(j)}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2))$ of subchannels that are not frozen to
zero. For each bit-channel index $j$ and $\beta < 0.5$, we have

$$
j \in \mathcal{G}(V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)) \cup \mathcal{D}_\ell.
$$

By the definition ©, each $\tilde{Z}(V_N^{(j)}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2))$ with $j$ in the set $\mathcal{G}(V(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2))$ is bounded by $2^{-N^{\beta}}$, so their sum is at most $N2^{-N^{\beta}}$;
therefore the error probability of the $\ell$-th level channel under SC decoding, denoted by $P_e^{SC}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$, can
be upper bounded by

$$
P_e^{SC}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2) \le N 2^{-N^{\beta}} + \sum_{j \in \mathcal{D}_\ell} \tilde{Z}(V_N^{(j)}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)).
$$

Since multistage decoding is utilized, by the union bound, the final decoding error probability for Bob is bounded
as

$$
\Pr\{\hat{\mathsf{M}} \ne \mathsf{M}\} \le \sum_{\ell=1}^{r} P_e^{SC}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2).
$$

Unfortunately, a proof that this scheme satisfies the reliability condition cannot be given here because a bound
on the sum $\sum_{j \in \mathcal{D}_\ell} \tilde{Z}(V_N^{(j)}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2))$ is not known. Note that significantly low probabilities of error can still
be achieved in practice since the size of $\mathcal{D}_\ell$ is very small for sufficiently large $N$.

The reliability problem was recently solved in [9], where a new scheme dividing the information message into
several blocks was proposed. For a specific block, $\mathcal{D}_\ell$ is still assigned random bits, which are transmitted in advance
in the set $\mathcal{A}_\ell$ of the previous block. This scheme involves negligible rate loss and finally realizes reliability and
strong security simultaneously. In this case, if the reliability of each partition channel can be achieved, i.e., for any
$\ell$-th level partition $\Lambda_{\ell-1}/\Lambda_\ell$, $P_e^{SC}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2)$ vanishes as $N \to \infty$, then the total decoding error probability for
Bob can be made arbitrarily small. Consequently, based on this new scheme of assigning the problematic set, the
error probability on level $\ell$ can be upper bounded by

$$
P_e^{SC}(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2) \le \epsilon_{N'}^{\ell} + k_\ell \cdot O(2^{-N'^{\beta}}), \tag{17}
$$

where $k_\ell$ is the number of information blocks on the $\ell$-th level, $N'$ is the length of each block, which satisfies
$N' \times k_\ell = N$, and $\epsilon_{N'}^{\ell}$ is caused by the first separate block on the $\ell$-th level consisting of the initial bits in
$\mathcal{D}_\ell$. Since $|\mathcal{D}_\ell|$ is extremely small compared to the block length $N$, the decoding failure probability for the
first block can be made arbitrarily small when $N$ is sufficiently large. Meanwhile, by the analysis in [15], when
$h(\Lambda, \sigma_b^2) \to \log(V(\Lambda))$, $h(\Lambda_r, \sigma_b^2) \to \frac{1}{2}\log(2\pi e \sigma_b^2)$, and $R_C \to C(\Lambda/\Lambda_r, \sigma_b^2)$, we have $\gamma_{\Lambda_b}(\sigma_b) \to 2\pi e$. Therefore,
$\Lambda_b$ is an AWGN-good lattice$^8$.

Note that the rate loss incurred by repeatedly transmitted bits in $\mathcal{D}_\ell$ is negligible because of its small size.
Specifically, the actual secrecy rate in the $\ell$-th level is given by $\frac{k_\ell}{k_\ell+1}\left[C(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_b^2) - C(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)\right]$. Clearly,
this rate can be made close to the secrecy capacity by choosing sufficiently large $k_\ell$ as well.

IV. Secrecy-good polar lattices with discrete Gaussian shaping 

In this section, we apply Gaussian shaping on the AWGN-good and secrecy-good polar lattices. The idea of
lattice Gaussian shaping was proposed in [30] and then implemented in [15] to construct capacity-achieving polar
lattices. For wiretap coding, the discrete Gaussian distribution can also be utilized to satisfy the power constraint.
In simple terms, after obtaining the AWGN-good lattice $\Lambda_b$ and the secrecy-good lattice $\Lambda_e$, Alice still maps each
message $m$ to a coset $\tilde{\lambda}_m \in \Lambda_b/\Lambda_e$ as mentioned in Sect. III. However, instead of the mod-$\Lambda_s$ operation, Alice
samples the encoded signal $\mathsf{X}^N$ from $D_{\Lambda_e + \lambda_m, \sigma_s}$, where $\lambda_m$ is the coset representative of $\tilde{\lambda}_m$ and $\sigma_s^2$ is arbitrarily
close to the signal power $P_s$ (see [6] for more details). The construction of polar lattices with Gaussian shaping is
reviewed in Sect. IV-A. With Gaussian shaping, we propose a new partition of the index set for the genuine GWC
in Sect. IV-B. Strong secrecy is proved in Sect. IV-C, and the extension to semantic security is given in Sect. IV-D.
Reliability is discussed in Sect. IV-E. Moreover, we will show that this shaping operation does not hurt the secrecy
rate and that the strong secrecy capacity can be achieved.

A. Gaussian shaping over polar lattices 

As shown in 021, the shaping scheme is based on the technique of polar codes for asymmetric channels. For 
the paper to be self-contained, a brief review will be presented in this subsection. A more detailed explanation of 
this Gaussian shaping technique can be found in [15].

8 More precisely, to make $\Lambda_b$ AWGN-good, we need $P_e(\Lambda_b, \sigma_b^2) \to 0$ by definition. By [15, Theorem 2], $P_e(\Lambda_b, \sigma_b^2) \le rN2^{-N^{\beta'}} +
N \cdot P_e(\Lambda_r, \sigma_b^2)$. According to the analysis in Remark 5, $r = O(\log N)$ is sufficient to guarantee $P_e(\Lambda_r, \sigma_b^2) = e^{-O(N)}$, meaning that a
sub-exponentially vanishing $P_e(\Lambda_b, \sigma_b^2)$ can be achieved.

Similarly to the polar coding on symmetric channels, the Bhattacharyya parameter for a binary memoryless
asymmetric (BMA) channel is defined as follows.

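Definition 3 (Bhattacharyya parameter for BMA channel): Let $W$ be a BMA channel with input $\mathsf{X} \in \mathcal{X} = \{0,1\}$
and output $\mathsf{Y} \in \mathcal{Y}$. The input distribution and channel transition probability are denoted by $P_{\mathsf{X}}$ and $P_{\mathsf{Y}|\mathsf{X}}$ respectively.
The Bhattacharyya parameter $Z$ for $W$ is then defined as

$$
\begin{aligned}
Z(\mathsf{X}|\mathsf{Y}) &= 2 \sum_{y} P_{\mathsf{Y}}(y) \sqrt{P_{\mathsf{X}|\mathsf{Y}}(0|y) P_{\mathsf{X}|\mathsf{Y}}(1|y)} \\
&= 2 \sum_{y} \sqrt{P_{\mathsf{X},\mathsf{Y}}(0,y) P_{\mathsf{X},\mathsf{Y}}(1,y)}.
\end{aligned}
$$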

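The following lemma, which will be useful for the forthcoming new partition scheme, shows that by adding
an observable at the output of $W$, $Z$ will not increase.

Lemma 7 (Conditioning reduces Bhattacharyya parameter $Z$ SB): Let $(\mathsf{X}, \mathsf{Y}, \mathsf{Y}') \sim P_{\mathsf{X},\mathsf{Y},\mathsf{Y}'}$, $\mathsf{X} \in \mathcal{X} = \{0,1\}$, $\mathsf{Y} \in
\mathcal{Y}$, $\mathsf{Y}' \in \mathcal{Y}'$. Then we have

$$
Z(\mathsf{X}|\mathsf{Y},\mathsf{Y}') \le Z(\mathsf{X}|\mathsf{Y}).
$$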
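
Definition 3 and Lemma 7 are easy to check numerically on a small example. The sketch below computes $Z(\mathsf{X}|\mathsf{Y})$ directly from a joint distribution stored as a dictionary; the representation, the helper names and the example channel (a non-uniform input observed through two independent binary symmetric channels) are assumptions made for illustration.

```python
import math
from collections import defaultdict


def bhattacharyya(joint):
    """Z(X|Y) = 2 * sum_y sqrt(P(0,y) * P(1,y)); joint maps (x, y) -> prob."""
    ys = {y for (_, y) in joint}
    return 2.0 * sum(math.sqrt(joint.get((0, y), 0.0) * joint.get((1, y), 0.0))
                     for y in ys)


def drop_second_observation(joint):
    """Marginalise P_{X,(Y,Y')} down to P_{X,Y}."""
    out = defaultdict(float)
    for (x, (y, _y2)), p in joint.items():
        out[(x, y)] += p
    return dict(out)


def two_look_joint(p1, flip_y, flip_y2):
    """X ~ Bernoulli(p1) seen through two independent BSCs (hypothetical example)."""
    joint = {}
    for x in (0, 1):
        px = p1 if x else 1.0 - p1
        for y in (0, 1):
            for y2 in (0, 1):
                py = flip_y if y != x else 1.0 - flip_y
                py2 = flip_y2 if y2 != x else 1.0 - flip_y2
                joint[(x, (y, y2))] = px * py * py2
    return joint


full = two_look_joint(0.3, 0.1, 0.2)
z_with_extra = bhattacharyya(full)                       # Z(X | Y, Y')
z_without = bhattacharyya(drop_second_observation(full)) # Z(X | Y)
```

For a uniform input over a binary symmetric channel with crossover probability $p$, the same function returns the familiar value $2\sqrt{p(1-p)}$.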

When $\mathsf{X}$ is uniformly distributed, the Bhattacharyya parameter of BMA channels coincides with that of BMS
channels defined in Definition 2. Moreover, the calculation of $Z$ can be converted to the calculation of the
Bhattacharyya parameter $\tilde{Z}$ for a related BMS channel. The following lemma is implicitly considered in [36]
and then explicitly expressed in [15]. We show it here for completeness.

Lemma 8 (From Asymmetric to Symmetric channel [15]): Let $\tilde{W}$ be a binary-input channel corresponding to
the asymmetric channel $W$ with input $\tilde{\mathsf{X}} \in \mathcal{X} = \{0,1\}$ and output $\tilde{\mathsf{Y}} \in \{\mathcal{Y}, \mathcal{X}\}$. The input of $\tilde{W}$ is uniformly
distributed, i.e., $P_{\tilde{\mathsf{X}}}(\tilde{x} = 0) = P_{\tilde{\mathsf{X}}}(\tilde{x} = 1) = \frac{1}{2}$. The relationship between $\tilde{W}$ and $W$ is shown in Fig. 4. Then $\tilde{W}$ is
a binary symmetric channel in the sense that $P_{\tilde{\mathsf{Y}}|\tilde{\mathsf{X}}}(y, x \oplus \tilde{x} | \tilde{x}) = P_{\mathsf{Y},\mathsf{X}}(y, x)$.


Fig. 4. The relationship between $\tilde{W}$ and $W$.
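
The symmetrization of Lemma 8 can be checked on a toy channel: the Bhattacharyya parameter of $\tilde{W}$ under uniform input should coincide with $Z(\mathsf{X}|\mathsf{Y})$ of the asymmetric setting. The sketch below assumes a dictionary representation of $P_{\mathsf{X},\mathsf{Y}}$ and uses a hypothetical asymmetric binary channel as the example; the helper names are not from the paper.

```python
import math


def bhattacharyya(joint):
    """Asymmetric Z(X|Y) = 2 * sum_y sqrt(P(0,y) * P(1,y)); joint maps (x, y) -> prob."""
    ys = {y for (_, y) in joint}
    return 2.0 * sum(math.sqrt(joint.get((0, y), 0.0) * joint.get((1, y), 0.0))
                     for y in ys)


def symmetrize(joint):
    """Transition law of W-tilde: x_tilde -> {(x XOR x_tilde, y): P_{X,Y}(x, y)}."""
    return {xt: {(x ^ xt, y): p for (x, y), p in joint.items()} for xt in (0, 1)}


def bms_bhattacharyya(trans):
    """Z-tilde = sum over outputs of sqrt(W(out|0) * W(out|1)) for uniform input."""
    outs = set(trans[0]) | set(trans[1])
    return sum(math.sqrt(trans[0].get(o, 0.0) * trans[1].get(o, 0.0)) for o in outs)


# Hypothetical asymmetric example: P(X=1) = 0.3, P(Y=1|X=0) = 0.1, P(Y=0|X=1) = 0.4.
joint = {(0, 0): 0.7 * 0.9, (0, 1): 0.7 * 0.1, (1, 0): 0.3 * 0.4, (1, 1): 0.3 * 0.6}
w_tilde = symmetrize(joint)
z_asym = bhattacharyya(joint)
z_sym = bms_bhattacharyya(w_tilde)
```

This is the single-use ($N = 1$) case; the block-level statement is the subject of the next lemma.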


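The following lemma describes how to construct a polar code for a BMA channel $W$ from that for the associated
BMS channel $\tilde{W}$.

Lemma 9 (The equivalence between symmetric and asymmetric Bhattacharyya parameters [36]): For a BMA
channel $W$ with input $\mathsf{X} \sim P_{\mathsf{X}}$, let $\tilde{W}$ be its symmetrized channel constructed according to Lemma 8. Let
$\mathsf{X}^{[N]}$ and $\mathsf{Y}^{[N]}$ be the input and output vectors of $W^N$, and let $\tilde{\mathsf{X}}^{[N]}$ and $\tilde{\mathsf{Y}}^{[N]} = (\mathsf{X}^{[N]} \oplus \tilde{\mathsf{X}}^{[N]}, \mathsf{Y}^{[N]})$ be the input
and output vectors of $\tilde{W}^N$. Consider polarized random variables $\mathsf{U}^{[N]} = \mathsf{X}^{[N]} G_N$ and $\tilde{\mathsf{U}}^{[N]} = \tilde{\mathsf{X}}^{[N]} G_N$, and denote by
$W_N$ and $\tilde{W}_N$ the combining channel of $N$ uses of $W$ and $\tilde{W}$, respectively. The Bhattacharyya parameter for each
subchannel of $W_N$ is equal to that of each subchannel of $\tilde{W}_N$, i.e.,

$$
Z(\mathsf{U}^i | \mathsf{U}^{1:i-1}, \mathsf{Y}^{[N]}) = \tilde{Z}(\tilde{\mathsf{U}}^i | \tilde{\mathsf{U}}^{1:i-1}, \mathsf{X}^{[N]} \oplus \tilde{\mathsf{X}}^{[N]}, \mathsf{Y}^{[N]}).
$$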

To obtain the desired input distribution $P_{\mathsf{X}}$ for $W$, the indices with very small $Z(\mathsf{U}^i | \mathsf{U}^{1:i-1})$ should be
removed from the information set of the symmetric channel. Following [15], the resultant subset is referred to as
the information set $\mathcal{I}$ for the asymmetric channel $W$. For the remaining part $\mathcal{I}^c$, we further find that there
are some bits which can be made independent of the information bits and uniformly distributed. The purpose of
extracting such bits is for the interest of our lattice construction. We name the set that includes those independent
frozen bits the frozen set $\mathcal{F}$, and the remaining bits are determined by the bits in $\mathcal{F} \cup \mathcal{I}$. We name the set of all
those deterministic bits the shaping set $\mathcal{S}$. The three sets are formally defined as follows:

$$
\begin{cases}
\text{frozen set: } \mathcal{F} = \{i \in [N] : Z(\mathsf{U}^i | \mathsf{U}^{1:i-1}, \mathsf{Y}^{[N]}) \ge 1 - 2^{-N^{\beta}}\} \\
\text{information set: } \mathcal{I} = \{i \in [N] : Z(\mathsf{U}^i | \mathsf{U}^{1:i-1}, \mathsf{Y}^{[N]}) \le 2^{-N^{\beta}} \text{ and } Z(\mathsf{U}^i | \mathsf{U}^{1:i-1}) \ge 1 - 2^{-N^{\beta}}\} \\
\text{shaping set: } \mathcal{S} = (\mathcal{F} \cup \mathcal{I})^c.
\end{cases} \tag{18}
$$

To identify these three sets, one can use Lemma [9] to calculate Z(U®|U 1:,_1 , Y^^X^) using the known 
constructing techniques for symmetric polar codes ED OH. We note that Z(U l |U 1: * 1 ) can be computed in a 
similar way, by constructing a symmetric channel between X and X © X. Besides the construction, the decoding 
process for the asymmetric polar codes can also be converted to the decoding for the symmetric polar codes. 

The polar coding scheme according to (18), which can be viewed as an extension of the scheme proposed in [36],
has been proved to be capacity-achieving in on. Moreover, it can be extended to the construction of multilevel 
asymmetric polar codes. 

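Theorem 3 (Construction of multilevel polar codes SBS): Consider a polar code with the following encoding
strategy for the channel of the $\ell$-th ($\ell \le r$) level $W_\ell$ with the channel transition probability $P_{\mathsf{Y}|\mathsf{X}_\ell, \mathsf{X}_{1:\ell-1}}(y | x_\ell, x_{1:\ell-1})$.

• Encoding: Before sending the codeword $x_\ell^{[N]} = u_\ell^{[N]} G_N$, the index set $[N]$ is divided into three parts: the
frozen set $\mathcal{F}_\ell$, the information set $\mathcal{I}_\ell$, and the shaping set $\mathcal{S}_\ell$, which are defined as follows:

$$
\begin{cases}
\mathcal{F}_\ell = \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Y}^{[N]}) \ge 1 - 2^{-N^{\beta}}\} \\
\mathcal{I}_\ell = \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Y}^{[N]}) \le 2^{-N^{\beta}} \text{ and } Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}) \ge 1 - 2^{-N^{\beta}}\} \\
\mathcal{S}_\ell = (\mathcal{F}_\ell \cup \mathcal{I}_\ell)^c.
\end{cases}
$$

The encoder first places uniformly distributed information bits in $\mathcal{I}_\ell$. Then the frozen set $\mathcal{F}_\ell$ is filled with a
uniform random sequence which is shared between the encoder and the decoder. The bits in $\mathcal{S}_\ell$ are generated
by a random mapping $\phi_{\mathcal{S}_\ell}$ which yields the following distribution:

$$
u_\ell^i =
\begin{cases}
0 & \text{with probability } P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}(0 | u_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}), \\
1 & \text{with probability } P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}(1 | u_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}).
\end{cases} \tag{19}
$$

Then any message rate arbitrarily close to $I(\mathsf{X}_\ell; \mathsf{Y} | \mathsf{X}_{1:\ell-1})$ is achievable using the SC decoding$^9$ and the
expectation of the decoding error probability over the randomized mappings satisfies $E_{\phi_{\mathcal{S}_\ell}}[P_e(\phi_{\mathcal{S}_\ell})] = O(2^{-N^{\beta'}})$
for any $\beta' < \beta < 0.5$.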
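
The random mapping in (19) draws each bit from its conditional distribution given the preceding bits. The sketch below illustrates this for a single level with a tiny block length by brute-force marginalization. It makes several simplifying assumptions that are not in the text: the codeword symbols are i.i.d. Bernoulli (no conditioning on previous levels), the transform is $F^{\otimes n}$ without bit-reversal, and every bit (not only those in the shaping set) is drawn from the conditional distribution. All names are hypothetical.

```python
import itertools
import random


def polar_transform(u):
    """x = u * F^{(x)n} over GF(2) with F = [[1, 0], [1, 1]] (no bit-reversal)."""
    x = list(u)
    n, h = len(x), 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                x[j] ^= x[j + h]
        h *= 2
    return x


def prob_x(x, p1):
    """Probability of x when its entries are i.i.d. Bernoulli(p1)."""
    out = 1.0
    for b in x:
        out *= p1 if b else 1.0 - p1
    return out


def conditional(prefix, bit, n, p1):
    """P(U^i = bit | U^{1:i-1} = prefix), by summing over all suffixes."""
    def mass(head):
        tail_len = n - len(head)
        return sum(prob_x(polar_transform(list(head) + list(t)), p1)
                   for t in itertools.product((0, 1), repeat=tail_len))
    return mass(list(prefix) + [bit]) / mass(list(prefix))


def sample_u(n, p1, rng):
    """Draw u bit by bit from the conditional distributions, as in (19)."""
    u = []
    for _ in range(n):
        u.append(1 if rng.random() < conditional(u, 1, n, p1) else 0)
    return u
```

When all bits are drawn this way, `polar_transform(sample_u(n, p1, rng))` has exactly the target i.i.d. distribution; the actual scheme replaces some of these draws with message, frozen and random bits, which is why only closeness in distribution is claimed.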

Now let us pick a suitable input distribution $P_{\mathsf{X}_{1:r}}$ to implement the shaping. As shown in Theorem 1, the
mutual information between the discrete Gaussian lattice distribution $D_{\Lambda,\sigma_s}$ and the output of the AWGN channel
approaches $\frac{1}{2}\log(1 + \mathsf{SNR})$ as the flatness factor $\epsilon_\Lambda(\tilde{\sigma}) \to 0$. Therefore, we use the lattice Gaussian distribution
$P_{\mathsf{X}} \sim D_{\Lambda,\sigma_s}$ as the constellation, which gives us $\lim_{r\to\infty} P_{\mathsf{X}_{1:r}} = P_{\mathsf{X}} \sim D_{\Lambda,\sigma_s}$. By [15, Lemma 5], when $N \to \infty$,
the mutual information $I(\mathsf{X}_r; \mathsf{Y} | \mathsf{X}_{1:r-1})$ at the bottom level goes to 0 if $r = O(\log\log N)$, and using the first $r$
levels would involve a capacity loss $\sum_{\ell > r} I(\mathsf{X}_\ell; \mathsf{Y} | \mathsf{X}_{1:\ell-1}) \le O(\frac{1}{N})$.

From the chain rule of mutual information,

$$
I(\mathsf{X}_{1:r}; \mathsf{Y}) = \sum_{\ell=1}^{r} I(\mathsf{X}_\ell; \mathsf{Y} | \mathsf{X}_{1:\ell-1}),
$$

we have $r$ binary-input channels, and the $\ell$-th channel according to $I(\mathsf{X}_\ell; \mathsf{Y} | \mathsf{X}_{1:\ell-1})$ is generally asymmetric with
the input distribution $P_{\mathsf{X}_\ell | \mathsf{X}_{1:\ell-1}}$ ($1 \le \ell \le r$). Then we can construct the polar code for the asymmetric channel
at each level according to Lemma 8. It is shown in [15] that the $\ell$-th symmetrized channel is equivalent to the
MMSE-scaled $\Lambda_{\ell-1}/\Lambda_\ell$ channel in the sense of channel polarization.

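Therefore, when the power constraint is taken into consideration, the multilevel polar codes before shaping are
constructed according to the symmetric channels $V(\Lambda_{\ell-1}/\Lambda_\ell, \tilde{\sigma}_b^2)$ and $W(\Lambda_{\ell-1}/\Lambda_\ell, \tilde{\sigma}_e^2)$, where
$\tilde{\sigma}_b^2 = \left(\frac{\sigma_s \sigma_b}{\sqrt{\sigma_s^2 + \sigma_b^2}}\right)^2$ and $\tilde{\sigma}_e^2 = \left(\frac{\sigma_s \sigma_e}{\sqrt{\sigma_s^2 + \sigma_e^2}}\right)^2$
are the MMSE-scaled noise variances of the main channel and of the wiretapper's channel,
respectively. This is similar to the mod-$\Lambda_s$ GWC scenario mentioned in the previous section. The difference is that
$\sigma_b^2$ and $\sigma_e^2$ are replaced by $\tilde{\sigma}_b^2$ and $\tilde{\sigma}_e^2$ accordingly. As a result, we can still obtain an AWGN-good lattice $\Lambda_b$ and
a secrecy-good lattice $\Lambda_e$ by treating $V(\Lambda_{\ell-1}/\Lambda_\ell, \tilde{\sigma}_b^2)$ and $W(\Lambda_{\ell-1}/\Lambda_\ell, \tilde{\sigma}_e^2)$ as the main channel and wiretapper's
channel at each level.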
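
A short numerical check makes the role of the MMSE scaling concrete: with the scaled variances, the rate expression $\frac{1}{2}\log(\tilde{\sigma}_e^2/\tilde{\sigma}_b^2)$ equals the difference of the two AWGN capacities $\frac{1}{2}\log(1+\sigma_s^2/\sigma_b^2) - \frac{1}{2}\log(1+\sigma_s^2/\sigma_e^2)$. The function names and the sample values below are assumptions chosen for illustration, and rates are in bits.

```python
import math


def mmse_scaled_variance(sigma_s2, sigma2):
    """MMSE-scaled noise variance sigma_s^2 * sigma^2 / (sigma_s^2 + sigma^2)."""
    return sigma_s2 * sigma2 / (sigma_s2 + sigma2)


def shaped_rate_limit(sigma_s2, sigma_b2, sigma_e2):
    """0.5 * log2 of the ratio of the MMSE-scaled variances (Eve over Bob)."""
    tb = mmse_scaled_variance(sigma_s2, sigma_b2)
    te = mmse_scaled_variance(sigma_s2, sigma_e2)
    return 0.5 * math.log2(te / tb)


def gaussian_wiretap_capacity(sigma_s2, sigma_b2, sigma_e2):
    """Difference of the two AWGN capacities at signal power sigma_s^2."""
    return (0.5 * math.log2(1 + sigma_s2 / sigma_b2)
            - 0.5 * math.log2(1 + sigma_s2 / sigma_e2))


rate = shaped_rate_limit(10.0, 1.0, 4.0)
cap = gaussian_wiretap_capacity(10.0, 1.0, 4.0)
```

As the signal power grows, both quantities approach $\frac{1}{2}\log(\sigma_e^2/\sigma_b^2)$, the limit obtained for the mod-$\Lambda_s$ channel.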


B. Three-dimensional partition 

Now we consider the partition of the index set $[N]$ with shaping involved. According to the analysis of asymmetric
polar codes, we have to eliminate those indices with small $Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]})$ from the information set of the
symmetric channels. Therefore, Alice cannot send message bits on those subchannels with $Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}) <
1 - 2^{-N^{\beta}}$. Note that this part is the same for $\tilde{V}_\ell$ and $\tilde{W}_\ell$, because it only depends on the shaping distribution. At
each level, the index set which is used for shaping is given as

$$
\mathcal{S}_\ell \triangleq \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}}\},
$$

and the index set which is not for shaping is denoted by $\mathcal{S}_\ell^c$. Recall that for the index set $[N]$, we already have
two partition criteria, i.e., reliability-good and information-bad (see (@}). We rewrite the reliability-good index set
$\mathcal{G}_\ell$ and information-bad index set $\mathcal{N}_\ell$ at level $\ell$ as

$$
\begin{aligned}
\mathcal{G}_\ell &\triangleq \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Y}^{[N]}) \le 2^{-N^{\beta}}\}, \\
\mathcal{N}_\ell &\triangleq \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Z}^{[N]}) \ge 1 - 2^{-N^{\beta}}\}.
\end{aligned} \tag{20}
$$

Note that $\mathcal{G}_\ell$ and $\mathcal{N}_\ell$ are defined by the asymmetric Bhattacharyya parameters. Nevertheless, by Lemma 9 and the
channel equivalence, we have $\mathcal{G}_\ell = \mathcal{G}(\tilde{V}_\ell)$ and $\mathcal{N}_\ell = \mathcal{N}(\tilde{W}_\ell)$ as defined in {!]), where $\tilde{V}_\ell$ and $\tilde{W}_\ell$ are the respective
symmetric channels or the MMSE-scaled $\Lambda_{\ell-1}/\Lambda_\ell$ channels for Bob and Eve at level $\ell$. The four sets $\mathcal{A}_\ell$, $\mathcal{B}_\ell$, $\mathcal{C}_\ell$,
and $\mathcal{D}_\ell$ are defined in the same fashion as ©, with $\mathcal{G}_\ell$ and $\mathcal{N}_\ell$ replacing $\mathcal{G}(\tilde{V}_\ell)$ and $\mathcal{N}(\tilde{W}_\ell)$, respectively. Now the
whole index set $[N]$ is divided like a cube in three directions, which is shown in Fig. 5.

9 It is possible to derandomize the mapping $\phi_{\mathcal{S}_\ell}$ for the purpose of achieving capacity alone. However, it is tricky to handle the random
mapping in order to achieve the secrecy capacity: it requires either sharing a secret random mapping or using the Markov block coding
technique (see Sect. IV-E).


Fig. 5. Partitions of the index set $[N]$ with shaping (axis labels: good for Bob / bad for Bob; information not poor for Eve / information poor for Eve).

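Clearly, we have eight blocks:

$$
\begin{aligned}
\mathcal{A}_\ell^{\mathcal{S}} &= \mathcal{A}_\ell \cap \mathcal{S}_\ell, & \mathcal{A}_\ell^{\mathcal{S}^c} &= \mathcal{A}_\ell \cap \mathcal{S}_\ell^c, \\
\mathcal{B}_\ell^{\mathcal{S}} &= \mathcal{B}_\ell \cap \mathcal{S}_\ell, & \mathcal{B}_\ell^{\mathcal{S}^c} &= \mathcal{B}_\ell \cap \mathcal{S}_\ell^c, \\
\mathcal{C}_\ell^{\mathcal{S}} &= \mathcal{C}_\ell \cap \mathcal{S}_\ell, & \mathcal{C}_\ell^{\mathcal{S}^c} &= \mathcal{C}_\ell \cap \mathcal{S}_\ell^c, \\
\mathcal{D}_\ell^{\mathcal{S}} &= \mathcal{D}_\ell \cap \mathcal{S}_\ell, & \mathcal{D}_\ell^{\mathcal{S}^c} &= \mathcal{D}_\ell \cap \mathcal{S}_\ell^c.
\end{aligned} \tag{21}
$$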

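By Lemma 7, we observe that $\mathcal{A}_\ell^{\mathcal{S}} = \mathcal{C}_\ell^{\mathcal{S}} = \emptyset$, $\mathcal{A}_\ell^{\mathcal{S}^c} = \mathcal{A}_\ell$, and $\mathcal{C}_\ell^{\mathcal{S}^c} = \mathcal{C}_\ell$. The shaping set $\mathcal{S}_\ell$ is divided into two
sets $\mathcal{B}_\ell^{\mathcal{S}}$ and $\mathcal{D}_\ell^{\mathcal{S}}$. The bits in $\mathcal{S}_\ell$ are determined by the bits in $\mathcal{S}_\ell^c$ according to the mapping. Similarly, $\mathcal{S}_\ell^c$ is divided
into the four sets $\mathcal{A}_\ell^{\mathcal{S}^c} = \mathcal{A}_\ell$, $\mathcal{B}_\ell^{\mathcal{S}^c}$, $\mathcal{C}_\ell^{\mathcal{S}^c} = \mathcal{C}_\ell$, and $\mathcal{D}_\ell^{\mathcal{S}^c}$. Note that for wiretap coding, the frozen set becomes $\mathcal{C}_\ell^{\mathcal{S}^c}$,
which is slightly different from the frozen set for channel coding. To satisfy the reliability condition, the frozen set
$\mathcal{C}_\ell^{\mathcal{S}^c}$ and the problematic set $\mathcal{D}_\ell^{\mathcal{S}^c}$ cannot be set uniformly random any more. Recall that only the independent frozen
set $\mathcal{F}_\ell$ at each level, which is defined as $\{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Y}^{[N]}) \ge 1 - 2^{-N^{\beta}}\}$, can be set uniformly
random (these bits are already shared between Alice and Bob), and the bits in the unpolarized frozen set $\bar{\mathcal{F}}_\ell$, defined
as $\{i \in [N] : 2^{-N^{\beta}} < Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{Y}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}}\}$, should be determined according to the mapping.
Moreover, we can observe that $\mathcal{F}_\ell \subset \mathcal{C}_\ell^{\mathcal{S}^c}$ and $\mathcal{D}_\ell^{\mathcal{S}^c} \subset \mathcal{D}_\ell \subset \bar{\mathcal{F}}_\ell$. Here we make the bits in $\mathcal{F}_\ell$ uniformly random
and the bits in $\mathcal{C}_\ell^{\mathcal{S}^c} \setminus \mathcal{F}_\ell$ and $\mathcal{D}_\ell^{\mathcal{S}^c}$ determined by the mapping. Therefore, from now on, we adjust the definition of
the shaping bits as:

$$
\mathcal{S}_\ell \triangleq \{i \in [N] : Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}} \text{ or } 2^{-N^{\beta}} < Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Y}^{[N]}) < 1 - 2^{-N^{\beta}}\}, \tag{22}
$$

which is essentially equivalent to the definition of the shaping set given in Theorem 3.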
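
The set relations above can be verified mechanically. The sketch below implements the three threshold tests of (20) to (22) on synthetic Bhattacharyya values; these values are made up, and are only constrained to respect the ordering $Z(\cdot|\mathsf{Y}) \le Z(\cdot|\mathsf{Z}) \le Z(\cdot)$ implied by degradation and Lemma 7. The function name, the dictionary layout and the threshold are assumptions for illustration.

```python
import random


def three_dim_partition(z_shape, z_bob, z_eve, delta):
    """Index sets of (20)-(22) from per-index Bhattacharyya values."""
    idx = set(range(len(z_bob)))
    G = {i for i in idx if z_bob[i] <= delta}            # reliability-good
    Nset = {i for i in idx if z_eve[i] >= 1 - delta}     # information-bad
    S_old = {i for i in idx if z_shape[i] < 1 - delta}   # first shaping set
    F = {i for i in idx if z_bob[i] >= 1 - delta}        # independent frozen set
    # adjusted shaping set (22): also absorbs Bob's unpolarized indices
    S_new = S_old | {i for i in idx if delta < z_bob[i] < 1 - delta}
    return {"A": G & Nset, "B": G - Nset, "C": Nset - G, "D": idx - G - Nset,
            "S_old": S_old, "S_new": S_new, "F": F, "all": idx}


rng = random.Random(0)
levels = [0.0, 1e-9, 0.3, 0.7, 1 - 1e-9, 1.0]
triples = [sorted(rng.choice(levels) for _ in range(3)) for _ in range(2000)]
z_bob = [t[0] for t in triples]      # smallest: Bob's observation is the best
z_eve = [t[1] for t in triples]
z_shape = [t[2] for t in triples]    # largest: no channel observation at all
sets = three_dim_partition(z_shape, z_bob, z_eve, 1e-6)
```

With values ordered in this way, the message set and the frozen set never intersect the first shaping set, and the complement of the adjusted shaping set consists exactly of the indices carrying message bits, uniformly random bits and independent frozen bits.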

To sum up, at level $\ell$, we assign the sets $\mathcal{A}_\ell^{\mathcal{S}^c}$, $\mathcal{B}_\ell^{\mathcal{S}^c}$, and $\mathcal{F}_\ell$ with message bits $\mathsf{M}_\ell$, uniformly random bits
$\mathsf{R}_\ell$, and uniform frozen bits $\mathsf{F}_\ell$, respectively. The remaining bits (in $\mathcal{S}_\ell$) will be fed with random bits according to
$P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}$. Clearly, this shaping operation will make the input distribution arbitrarily close to $P_{\mathsf{X}_\ell | \mathsf{X}_{1:\ell-1}}$. In
this case, we can obtain the equality between the Bhattacharyya parameter of the asymmetric setting and that of the symmetric
setting (see Lemma 9). This provides us a convenient way to prove the strong secrecy of the wiretap coding scheme
with shaping because we have already proved the strong secrecy of a symmetric wiretap coding scheme using the
Bhattacharyya parameter of the symmetric setting. A detailed proof will be presented in the following subsection.
Before this, we show that the shaping will not change the message rate.

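Lemma 10: For the symmetrized main channel $\tilde{V}_\ell$ and wiretapper's channel $\tilde{W}_\ell$, consider the reliability-good
index set $\mathcal{G}_\ell$ and information-bad index set $\mathcal{N}_\ell$ defined as in (20). By eliminating the shaping set $\mathcal{S}_\ell$ from the
original message set defined in (5), we get the new message set $\mathcal{A}_\ell^{\mathcal{S}^c} = \mathcal{G}_\ell \cap \mathcal{N}_\ell \cap \mathcal{S}_\ell^c$. The proportion of $|\mathcal{A}_\ell^{\mathcal{S}^c}|$
equals that of $|\mathcal{A}_\ell|$, and the message rate after shaping can still be arbitrarily close to $\frac{1}{2}\log\frac{\tilde{\sigma}_e^2}{\tilde{\sigma}_b^2}$.

Proof: By Theorem 2, when shaping is not involved, the message rate can be made arbitrarily close to $\frac{1}{2}\log\frac{\tilde{\sigma}_e^2}{\tilde{\sigma}_b^2}$.
By the new definition (22) of $\mathcal{S}_\ell$, we still have $\mathcal{A}_\ell^{\mathcal{S}} = \emptyset$, which means the shaping operation will not affect the
message rate. □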

C. Strong secrecy 

In this subsection, we introduce a new induced channel from Eve's perspective and prove in Lemma 11 that the information
leakage over this channel is vanishing at each level. Then, strong secrecy is proved by using the
chain rule of mutual information as in (15).

In [8], an induced channel is defined in order to prove strong secrecy. Here we call it the randomness-induced
channel because it is induced by feeding the subchannels in the sets $\mathcal{B}_\ell$ and $\mathcal{D}_\ell$ with uniformly random bits.
However, when shaping is involved, the sets $\mathcal{B}_\ell$ and $\mathcal{D}_\ell$ are no longer fed with uniformly random bits. In fact, some
subchannels (covered by the shaping mapping) should be fed with bits according to a random mapping. We define
the channel induced by the shaping bits as the shaping-induced channel.

Definition 4 (Shaping-induced channel): The shaping-induced channel $Q_N(W, \mathcal{S})$ is defined in terms of $N$ uses
of an asymmetric channel $W$, and a shaping subset $\mathcal{S}$ of $[N]$ of size $|\mathcal{S}|$. The input alphabet of $Q_N(W, \mathcal{S})$ is
$\{0,1\}^{N-|\mathcal{S}|}$, and the bits in $\mathcal{S}$ are determined by the input bits according to a random shaping $\Phi_{\mathcal{S}}$. A block diagram
of the shaping-induced channel is shown in Fig. 6.

Fig. 6. Block diagram of the shaping-induced channel $Q_N(W, \mathcal{S})$.

Based on the shaping-induced channel, we define a new induced channel, which is caused by feeding a part of 
the input bits of the shaping-induced channel with uniformly random bits. 

Definition 5 (New induced channel): Based on a shaping-induced channel $Q_N(W, \mathcal{S})$, the new induced channel
$Q_N(W, \mathcal{S}, \mathcal{R})$ is specified in terms of a randomness subset $\mathcal{R}$ of size $|\mathcal{R}|$. The randomness is introduced into the
input set of the shaping-induced channel. The input alphabet of $Q_N(W, \mathcal{S}, \mathcal{R})$ is $\{0,1\}^{N-|\mathcal{S}|-|\mathcal{R}|}$, and the bits in
$\mathcal{R}$ are uniformly and independently random. A block diagram of the new induced channel is shown in Fig. 7.

Fig. 7. Block diagram of the new induced channel $Q_N(W, \mathcal{S}, \mathcal{R})$.

The new induced channel is a combination of the shaping-induced channel and the randomness-induced channel.
This is different from the definition given in [8] because the bits in $\mathcal{S}$ are neither independent of the message bits
nor uniformly distributed. As long as the input bits of the new induced channel are uniform and the shaping bits are
chosen according to the random mapping, the new induced channel can still generate $2^N$ possible realizations $x_\ell^{[N]}$
of $\mathsf{X}_\ell^{[N]}$ as $N$ goes to infinity, and those $x_\ell^{[N]}$ can be viewed as the output of $N$ i.i.d. binary sources
with input distribution $P_{\mathsf{X}_\ell | \mathsf{X}_{1:\ell-1}}$. These are exactly the conditions required by Lemma 9. Specifically, we have
$Z(\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{Z}^{[N]}) = \tilde{Z}(\tilde{\mathsf{U}}_\ell^i | \tilde{\mathsf{U}}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{X}_\ell^{[N]} \oplus \tilde{\mathsf{X}}_\ell^{[N]}, \mathsf{Z}^{[N]})$. In simple words, this equation holds when
$x_\ell^{[N]}$ and $\tilde{x}_\ell^{[N]}$ are all selected from $\{0,1\}^N$ according to their respective distributions. Then we can exploit the
relation between the asymmetric channel and the corresponding symmetric channel to bound the mutual information
of the asymmetric channel. Therefore, we have to stick to the input distribution (uniform) of our new induced
channel and also the distribution of the random mapping. This is similar to the setting of the randomness-induced
channel in [8], where the input distribution and the randomness distribution are both set to be uniform. In [8], the
randomness-induced channel is further proved to be symmetric; then any other input distribution can also achieve
strong secrecy and the symmetry finally results in semantic security. In this work, however, we do not have a proof
of the symmetry of the new induced channel. For this reason, we assume for now that the message bits are uniformly
distributed. To prove semantic security, we will show in Sect. IV-D that the information leakage of the symmetrized version of
the new induced channel is vanishing.


Lemma 11: Let $\mathsf{M}_\ell$ be the uniformly distributed message bits and $\mathsf{F}_\ell$ be the independent frozen bits at the input
of the channel at the $\ell$-th level. When the shaping bits $\mathsf{S}_\ell$ are selected according to the random mapping $\Phi_{\mathcal{S}_\ell}$$^{10}$ and
$N$ is sufficiently large, the mutual information can be upper-bounded as

$$
I(\mathsf{M}_\ell \mathsf{F}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) \le O(N^2 2^{-N^{\beta'}}).
$$


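Proof: We first assume that $\mathsf{U}_\ell^i$ is selected according to the distribution $P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}$ for all $i \in [N]$, i.e.,

$$
u_\ell^i =
\begin{cases}
0 & \text{with probability } P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}(0 | u_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}), \\
1 & \text{with probability } P_{\mathsf{U}_\ell^i | \mathsf{U}_\ell^{1:i-1}, \mathsf{X}_{1:\ell-1}^{[N]}}(1 | u_\ell^{1:i-1}, x_{1:\ell-1}^{[N]})
\end{cases} \tag{23}
$$

for all $i \in [N]$. In this case, the input distribution $P_{\mathsf{X}_\ell | \mathsf{X}_{1:\ell-1}}$ at each level is exactly the optimal input distribution
obtained from the lattice Gaussian distribution. The mutual information between $\mathsf{M}_\ell \mathsf{F}_\ell$ and $(\mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]})$ in this
case is denoted by $I_P(\mathsf{M}_\ell \mathsf{F}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]})$.

For the new induced channel $Q_N(W_\ell, \mathcal{S}_\ell, \mathcal{R}_\ell)$ ($\mathcal{R}_\ell$ is $\mathcal{B}_\ell^{\mathcal{S}^c}$ according to the above analysis), we write the
indices of the input bits $(\mathcal{S}_\ell \cup \mathcal{R}_\ell)^c = [N] \setminus (\mathcal{S}_\ell \cup \mathcal{R}_\ell)$ as $\{i_1, i_2, \ldots, i_{N - s_\ell - r_\ell}\}$, where $|\mathcal{R}_\ell| = r_\ell$ and $|\mathcal{S}_\ell| = s_\ell$, and
assume that $i_1 < i_2 < \cdots < i_{N - s_\ell - r_\ell}$. We have

$$
\begin{aligned}
I_P(\mathsf{M}_\ell \mathsf{F}_\ell; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]})
&= I_P(\mathsf{U}_\ell^{(\mathcal{S}_\ell \cup \mathcal{R}_\ell)^c}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) \\
&= I_P(\mathsf{U}_\ell^{i_1}, \mathsf{U}_\ell^{i_2}, \ldots, \mathsf{U}_\ell^{i_{N - r_\ell - s_\ell}}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}) \\
&= \sum_{j=1}^{N - r_\ell - s_\ell} I_P(\mathsf{U}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]} | \mathsf{U}_\ell^{i_1}, \mathsf{U}_\ell^{i_2}, \ldots, \mathsf{U}_\ell^{i_{j-1}}) \\
&= \sum_{j=1}^{N - r_\ell - s_\ell} I_P(\mathsf{U}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{i_1}, \mathsf{U}_\ell^{i_2}, \ldots, \mathsf{U}_\ell^{i_{j-1}}) \\
&\overset{(a)}{\le} \sum_{j=1}^{N - r_\ell - s_\ell} I_P(\mathsf{U}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1}),
\end{aligned}
$$

where (a) holds because adding more variables will not decrease the mutual information.

10 As we will see in Sect. IV-E, to achieve reliability, Alice needs to secretly share the random mapping with Bob, or to use the Markov block coding
technique.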

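Then the above mutual information can be bounded by the mutual information of the symmetric channel plus
an infinitesimal term as follows:

$$
\begin{aligned}
&\sum_{j=1}^{N - r_\ell - s_\ell} I_P(\mathsf{U}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1}) \\
&\overset{(a)}{\le} \sum_{j=1}^{N - r_\ell - s_\ell} \left[ \tilde{I}(\tilde{\mathsf{U}}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{X}_\ell^{[N]} \oplus \tilde{\mathsf{X}}_\ell^{[N]}, \tilde{\mathsf{U}}_\ell^{1:i_j-1}) + H(\tilde{\mathsf{U}}_\ell^{i_j} | \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{X}_\ell^{[N]} \oplus \tilde{\mathsf{X}}_\ell^{[N]}, \tilde{\mathsf{U}}_\ell^{1:i_j-1}) \right] \\
&\qquad - \sum_{j=1}^{N - r_\ell - s_\ell} H(\mathsf{U}_\ell^{i_j} | \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1}) \\
&\overset{(b)}{\le} \sum_{j=1}^{N - r_\ell - s_\ell} \tilde{I}(\tilde{\mathsf{U}}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{X}_\ell^{[N]} \oplus \tilde{\mathsf{X}}_\ell^{[N]}, \tilde{\mathsf{U}}_\ell^{1:i_j-1}) \\
&\qquad + \sum_{j=1}^{N - r_\ell - s_\ell} \left[ Z(\mathsf{U}_\ell^{i_j} | \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1}) - Z(\mathsf{U}_\ell^{i_j} | \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1})^2 \right] \\
&\overset{(c)}{\le} \sum_{j=1}^{N - r_\ell - s_\ell} \tilde{I}(\tilde{\mathsf{U}}_\ell^{i_j}; \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{X}_\ell^{[N]} \oplus \tilde{\mathsf{X}}_\ell^{[N]}, \tilde{\mathsf{U}}_\ell^{1:i_j-1}) + N 2^{-N^{\beta}} \\
&\overset{(d)}{\le} N 2^{-N^{\beta'}} + N 2^{-N^{\beta}} \\
&\le 2N 2^{-N^{\beta'}}
\end{aligned}
$$

for $0 < \beta' < \beta < 0.5$. Inequalities (a)-(d) follow from

(a) uniformly distributed $\tilde{\mathsf{U}}_\ell^{i_j}$,

(b) [38, Proposition 2], which gives $H(X|Y) - H(X|Y,Z) \le Z(X|Y) - Z(X|Y,Z)^2$, and Lemma 9,

(c) our coding scheme guaranteeing that $Z(\mathsf{U}_\ell^{i_j} | \mathsf{Z}^{[N]}, \mathsf{X}_{1:\ell-1}^{[N]}, \mathsf{U}_\ell^{1:i_j-1})$ is greater than $1 - 2^{-N^{\beta}}$ for the frozen
bits and information bits,

(d) Lemma 2.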

For wiretap coding, the message $M_\ell$, frozen bits $F_\ell$, and random bits $R_\ell$ are all uniformly random, and the shaping bits $S_\ell$ are determined by the remaining bits $\mathcal{S}_\ell^c$ according to $\Phi_{\mathcal{S}_\ell}$. Let $Q_{U_\ell^{[N]}, X_{1:\ell-1}^{[N]}, Z^{[N]}}$ denote the joint distribution of $(U_\ell^{[N]}, X_{1:\ell-1}^{[N]}, Z^{[N]})$ resulting from uniformly distributed $M_\ell F_\ell R_\ell$ and $S_\ell$ according to $\Phi_{\mathcal{S}_\ell}$. By the proofs of [13, Th. 5] and [8, Th. 6], the total variation distance can be bounded as

$$\left\| Q_{U_\ell^{[N]}, X_{1:\ell-1}^{[N]}, Z^{[N]}} - P_{U_\ell^{[N]}, X_{1:\ell-1}^{[N]}, Z^{[N]}} \right\| \le N 2^{-N^{\beta'}} \tag{24}$$

for sufficiently large $N$.

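By [39, Proposition 5], the mutual information $I(M_\ell F_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]})$ caused by $Q_{U_\ell^{[N]}, X_{1:\ell-1}^{[N]}, Z^{[N]}}$ satisfies

$$\begin{aligned}
\left| I(M_\ell F_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}) - I_P(M_\ell F_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}) \right| &\le 7N 2^{-N^{\beta'}} \log 2^N + h_2(N 2^{-N^{\beta'}}) + h_2(N 2^{-N^{\beta'}}) \\
&= O(N^2 2^{-N^{\beta'}}),
\end{aligned}$$

where $h_2(\cdot)$ denotes the binary entropy function.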


□ 


Finally, strong secrecy (for uniform message bits) can be proved in the same fashion as shown in (15) as:

$$I(M; Z^{[N]}) \le \sum_{\ell=1}^{r} I(M_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}) \le \sum_{\ell=1}^{r} I(M_\ell F_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}) = O(rN^2 2^{-N^{\beta'}}).$$

Therefore we conclude that the whole shaping scheme is secure in the sense that the mutual information leakage between $M$ and $Z^{[N]}$ vanishes with the block length $N$.


D. Semantic security 

In this subsection, we extend strong secrecy of the constructed polar lattices to semantic security, namely the resulting strong secrecy does not rely on the distribution of the message. We take the level-1 wiretapper's channel $W_1$ as an example. Our goal is to show that the maximum mutual information between $M_1 F_1$ and $Z^{[N]}$ is vanishing for any input distribution as $N \to \infty$. Unlike the symmetric randomness-induced channel introduced in [8], the new induced channel is generally asymmetric with transition probability

$$Q(z | v) = \frac{1}{2^{r_1}} \sum_{e \in \{0,1\}^{r_1}} W_1^N\big(z \,|\, (v, e, \Phi_{\mathcal{S}_1}(v, e)) G_N\big),$$

where $\Phi_{\mathcal{S}_1}(v, e)$ represents the shaping bits determined by $v$ (the frozen bits and message bits together) and $e$ (the random bits) according to the random mapping $\Phi_{\mathcal{S}_1}$. It is difficult to find the optimal input distribution to maximize the mutual information for the new induced channel.

To prove the semantic security, we investigate the relationship between the $i$-th subchannel of $W_{1,N}$ and the $i$-th subchannel of its symmetrized version $\widetilde{W}_{1,N}$, which are denoted by $W_1^{(i,N)}$ and $\widetilde{W}_1^{(i,N)}$, respectively. According to Lemma 8, the asymmetric wiretap channel $W_1 : X_1 \to Z$ is symmetrized to channel $\widetilde{W}_1 : \tilde{X}_1 \to (Z, X_1 \oplus \tilde{X}_1)$. After the $N$-by-$N$ polarization transform, we obtain $W_1^{(i,N)} : U_1^i \to (U_1^{1:i-1}, Z^{[N]})$ and $\widetilde{W}_1^{(i,N)} : \tilde{U}_1^i \to (\tilde{U}_1^{1:i-1}, X_1^{[N]} \oplus \tilde{X}_1^{[N]}, Z^{[N]})$. The next lemma shows that if we symmetrize $W_1^{(i,N)}$ directly, i.e., construct a symmetric channel $\widehat{W}_1^{(i,N)} : \tilde{U}_1^i \to (U_1^{1:i-1}, Z^{[N]}, U_1^i \oplus \tilde{U}_1^i)$ in the sense of Lemma 8, then $\widehat{W}_1^{(i,N)}$ is degraded with respect to $\widetilde{W}_1^{(i,N)}$.

Lemma 12: The symmetrized channel $\widehat{W}_1^{(i,N)}$ derived directly from $W_1^{(i,N)}$ is degraded with respect to the $i$-th subchannel $\widetilde{W}_1^{(i,N)}$ of $\widetilde{W}_1$.

Proof: According to the proof of [36, Theorem 2], we have the relationship

$$\widetilde{W}_1^{(i,N)}(\tilde{u}_1^{1:i-1}, x_1^{[N]} \oplus \tilde{x}_1^{[N]}, z^{[N]} \,|\, \tilde{u}_1^i) = 2^{-N+1} P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i}, z^{[N]}).$$

Letting $x_1^{[N]} \oplus \tilde{x}_1^{[N]} = 0^{[N]}$, the equation becomes $\widetilde{W}_1^{(i,N)}(u_1^{1:i-1}, 0^{[N]}, z^{[N]} \,|\, u_1^i) = 2^{-N+1} P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i}, z^{[N]})$, which has already been addressed in [36]. However, for a fixed pair $(u_1^{1:i-1}, z^{[N]})$ and $\tilde{u}_1^i = u_1^i$, since $G_N$ is full rank, there are $2^{N-1}$ choices of $x_1^{[N]} \oplus \tilde{x}_1^{[N]}$ remaining, which means that there exist $2^{N-1}$ output symbols of $\widetilde{W}_1^{(i,N)}$ having the same transition probability $2^{-N+1} P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i}, z^{[N]})$. Suppose a middle channel which maps all these output symbols to one single symbol, which then has transition probability $P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i}, z^{[N]})$. The same operation can be done for $\tilde{u}_1^i = u_1^i \oplus 1$, making another symbol with transition probability $P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i-1}, u_1^i \oplus 1, z^{[N]})$ corresponding to the input $u_1^i \oplus 1$. This is a channel degradation process, and the degraded channel is symmetric.

Then we show that the symmetrized channel $\widehat{W}_1^{(i,N)}$ is equivalent to the degraded channel mentioned above. By Lemma 8, the channel transition probability of $\widehat{W}_1^{(i,N)}$ is

$$\widehat{W}_1^{(i,N)}(u_1^{1:i-1}, u_1^i \oplus \tilde{u}_1^i, z^{[N]} \,|\, \tilde{u}_1^i) = P_{U_1^{1:i}, Z^{[N]}}(u_1^{1:i}, z^{[N]}),$$

which is equal to the transition probability of the degraded channel discussed in the previous paragraph. Therefore, $\widehat{W}_1^{(i,N)}$ is degraded with respect to $\widetilde{W}_1^{(i,N)}$. □

Remark 6: In fact, a stronger relationship, namely that $\widehat{W}_1^{(i,N)}$ is equivalent to $\widetilde{W}_1^{(i,N)}$, can be proved. This is because the output symbols combined in the channel degradation process have the same LR. Evidence of this result can be found in [36, Equation (36)], where $Z(\widetilde{W}_1^{(i,N)}) = Z(U_1^i | U_1^{1:i-1}, Z^{[N]}) = Z(\widehat{W}_1^{(i,N)})$. Nevertheless, the degradation relationship is sufficient for this work. Notice that Lemma 12 can be generalized to a higher level $\ell$, with outputs $Z^{[N]}$ replaced by $(Z^{[N]}, X_{1:\ell-1}^{[N]})$.

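Illuminated by Lemma 12, we can also symmetrize the new induced channel at level $\ell$ and show that it is degraded with respect to the randomness-induced channel constructed from $\widetilde{W}_\ell$. For simplicity, letting $\ell = 1$, the new induced channel at level 1 is $Q_N(W_1, \mathcal{S}_1, \mathcal{R}_1) : U_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c} \to Z^{[N]}$, which is symmetrized to $\widehat{Q}_N(W_1, \mathcal{S}_1, \mathcal{R}_1) : \tilde{U}_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c} \to (Z^{[N]}, U_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c} \oplus \tilde{U}_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c})$ in the same fashion as in Lemma 8. Recall that the randomness-induced channel of $\widetilde{W}_1$ defined in [8] can be denoted as $Q_N(\widetilde{W}_1, \mathcal{R}_1 \cup \mathcal{S}_1) : \tilde{U}_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c} \to (Z^{[N]}, X_1^{[N]} \oplus \tilde{X}_1^{[N]})$. Note that for the randomness-induced channel $Q_N(\widetilde{W}_1, \mathcal{R}_1 \cup \mathcal{S}_1)$, the set $\mathcal{R}_1 \cup \mathcal{S}_1$ is fed with uniformly random bits, which is different from the shaping-induced channel.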

Lemma 13: For an asymmetric channel $W_1 : X_1 \to Z$ and its symmetrized channel $\widetilde{W}_1 : \tilde{X}_1 \to (Z, X_1 \oplus \tilde{X}_1)$, the symmetrized version $\widehat{Q}_N(W_1, \mathcal{S}_1, \mathcal{R}_1)$ of the new induced channel is degraded with respect to the randomness-induced channel $Q_N(\widetilde{W}_1, \mathcal{R}_1 \cup \mathcal{S}_1)$.

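Proof: The proof is similar to that of Lemma 12. For a fixed realization $z^{[N]}$ and input $\tilde{u}_1^{(\mathcal{S}_1 \cup \mathcal{R}_1)^c}$, there are $2^{|\mathcal{S}_1 \cup \mathcal{R}_1|}$ choices of $\tilde{u}_1^{\mathcal{S}_1 \cup \mathcal{R}_1}$ remaining. Since $z^{[N]}$ is only dependent on $x_1^{[N]}$, we can build a middle channel which merges the $2^{|\mathcal{S}_1 \cup \mathcal{R}_1|}$ output symbols of $Q_N(\widetilde{W}_1, \mathcal{R}_1 \cup \mathcal{S}_1)$ to one output symbol of $\widehat{Q}_N(W_1, \mathcal{S}_1, \mathcal{R}_1)$, which means that $\widehat{Q}_N(W_1, \mathcal{S}_1, \mathcal{R}_1)$ is degraded with respect to $Q_N(\widetilde{W}_1, \mathcal{R}_1 \cup \mathcal{S}_1)$. Again, this result can be generalized to higher levels. □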
Finally, we are ready to prove the semantic security of our wiretap coding scheme. For brevity, let $M_\ell F_\ell$ and $\tilde{M}_\ell \tilde{F}_\ell$ denote $U_\ell^{(\mathcal{S}_\ell \cup \mathcal{R}_\ell)^c}$ and $\tilde{U}_\ell^{(\mathcal{S}_\ell \cup \mathcal{R}_\ell)^c}$, respectively. Recall that $M$ is divided into $M_1, \ldots, M_r$ at each level. We express $MF$ and $\tilde{M}\tilde{F}$ as the collection of message and frozen bits on all levels of the new induced channel and the symmetric randomness-induced channel, respectively. We also define $MF \oplus \tilde{M}\tilde{F}$ as the operation $M_\ell F_\ell \oplus \tilde{M}_\ell \tilde{F}_\ell$ from level 1 to level $r$.

Theorem 4 (Semantic security): For arbitrarily distributed message $M$, the information leakage $I(M; Z^{[N]})$ of the proposed wiretap lattice code is upper-bounded as

$$I(M; Z^{[N]}) \le I(\tilde{M}\tilde{F}; Z^{[N]}, MF \oplus \tilde{M}\tilde{F}) \le rN 2^{-N^{\beta'}},$$

where $I(\tilde{M}\tilde{F}; Z^{[N]}, MF \oplus \tilde{M}\tilde{F})$ is the capacity of the symmetrized channel derived from the non-binary channel $MF \to Z^{[N]}$.$^{11}$

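Proof: By [8, Proposition 16], the channel capacity of the randomness-induced channel $Q_N(\widetilde{W}_1, \mathcal{S}_1 \cup \mathcal{R}_1)$ is upper-bounded by $N 2^{-N^{\beta}}$ when partition rule (?) is used. By channel degradation, the channel capacity of the symmetrized new induced channel $\widehat{Q}_N(W_1, \mathcal{S}_1, \mathcal{R}_1)$ can also be upper-bounded by $N 2^{-N^{\beta}}$. Since this result can be generalized to a higher level $\ell$ ($\ell \ge 1$), we obtain $C(\widehat{Q}_N(W_\ell, \mathcal{S}_\ell, \mathcal{R}_\ell)) \le N 2^{-N^{\beta}}$, which means $I(\tilde{M}_\ell \tilde{F}_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}, M_\ell F_\ell \oplus \tilde{M}_\ell \tilde{F}_\ell) \le N 2^{-N^{\beta}}$. Similarly to (15), we have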

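$$\begin{aligned}
I(\tilde{M}\tilde{F}; Z^{[N]}, MF \oplus \tilde{M}\tilde{F})
&= \sum_{\ell=1}^{r} I(\tilde{M}_\ell \tilde{F}_\ell; Z^{[N]}, MF \oplus \tilde{M}\tilde{F} \,|\, \tilde{M}_{1:\ell-1} \tilde{F}_{1:\ell-1}) \\
&= \sum_{\ell=1}^{r} H(\tilde{M}_\ell \tilde{F}_\ell \,|\, \tilde{M}_{1:\ell-1} \tilde{F}_{1:\ell-1}) - H(\tilde{M}_\ell \tilde{F}_\ell \,|\, Z^{[N]}, MF \oplus \tilde{M}\tilde{F}, \tilde{M}_{1:\ell-1} \tilde{F}_{1:\ell-1}) \\
&\le \sum_{\ell=1}^{r} H(\tilde{M}_\ell \tilde{F}_\ell) - H(\tilde{M}_\ell \tilde{F}_\ell \,|\, Z^{[N]}, MF \oplus \tilde{M}\tilde{F}, \tilde{M}_{1:\ell-1} \tilde{F}_{1:\ell-1}) \\
&= \sum_{\ell=1}^{r} I(\tilde{M}_\ell \tilde{F}_\ell; Z^{[N]}, MF \oplus \tilde{M}\tilde{F}, \tilde{M}_{1:\ell-1} \tilde{F}_{1:\ell-1}) \\
&\overset{(a)}{=} \sum_{\ell=1}^{r} I(\tilde{M}_\ell \tilde{F}_\ell; Z^{[N]}, M_\ell F_\ell \oplus \tilde{M}_\ell \tilde{F}_\ell) \\
&\overset{(b)}{\le} \sum_{\ell=1}^{r} I(\tilde{M}_\ell \tilde{F}_\ell; Z^{[N]}, X_{1:\ell-1}^{[N]}, M_\ell F_\ell \oplus \tilde{M}_\ell \tilde{F}_\ell) \\
&\le rN 2^{-N^{\beta'}},
\end{aligned}$$

where equality (a) holds because $Z^{[N]}$ is determined by $MFR$ and $\tilde{M}_\ell \tilde{F}_\ell$ is independent of $M_{\ell+1:r} F_{\ell+1:r} \oplus \tilde{M}_{\ell+1:r} \tilde{F}_{\ell+1:r}$, and inequality (b) holds because adding more variables will not decrease the mutual information.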

11 The symmetrization of a non-binary channel is similar to that of a binary channel as shown in Lemma 8. When $X$ and $\tilde{X}$ are both non-binary, $X \oplus \tilde{X}$ denotes the result of the exclusive or (xor) operation of the binary expressions of $X$ and $\tilde{X}$.

Therefore, we have

$$\begin{aligned}
I(M; Z^{[N]}) &\le I(MF; Z^{[N]}) \\
&\overset{(a)}{\le} H(MF \oplus \tilde{M}\tilde{F}) - H(MF) + I(MF; Z^{[N]}) \\
&\overset{(b)}{=} I(\tilde{M}\tilde{F}; Z^{[N]}, MF \oplus \tilde{M}\tilde{F}) \\
&\le rN 2^{-N^{\beta'}},
\end{aligned}$$

where the equality in (a) holds iff $MF$ is also uniform, and (b) is due to the chain rule.


□ 
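The two steps (a) and (b) above can be checked numerically on a single binary-input channel. The following is a minimal sketch, not the construction of the paper: the toy channel `W` and the helper names are hypothetical, and the symmetrized channel is taken in the usual form where the input is a uniform bit $\tilde{X}$ and the output is $(Z, X \oplus \tilde{X})$ with transition probability $P_{X,Z}(x, z)$.

```python
import math
from itertools import product

def entropy(p):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(p_in, channel):
    """I(input; output) in bits; channel[x] maps output symbol -> probability."""
    out = {}
    for x, px in enumerate(p_in):
        for y, w in channel[x].items():
            out[y] = out.get(y, 0.0) + px * w
    h_out_given_in = sum(px * entropy(channel[x].values())
                         for x, px in enumerate(p_in))
    return entropy(out.values()) - h_out_given_in

# Hypothetical asymmetric binary-input channel W(z | x), for illustration only.
W = [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}]

def symmetrized(p_x):
    """Channel from x_tilde to (z, s) with s = x xor x_tilde and
    transition probability P_X(x) * W(z | x), where x = x_tilde xor s."""
    return [
        {(z, s): p_x[xt ^ s] * W[xt ^ s][z]
         for z, s in product((0, 1), repeat=2)}
        for xt in (0, 1)
    ]

def leakage_pair(p0):
    """Return (I(X; Z), I(X_tilde; Z, X xor X_tilde)) for P_X = (p0, 1 - p0)."""
    p_x = [p0, 1.0 - p0]
    direct = mutual_information(p_x, W)
    sym = mutual_information([0.5, 0.5], symmetrized(p_x))
    return direct, sym

for p0 in (0.1, 0.3, 0.5, 0.9):
    direct, sym = leakage_pair(p0)
    print(f"P_X(0)={p0:.1f}  I(X;Z)={direct:.4f}  symmetrized={sym:.4f}")
```

For every input distribution the second value dominates the first, and the two coincide when the input is uniform, which mirrors why bounding the capacity of the symmetrized channel is enough to control the leakage for an arbitrarily distributed message.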


E. Reliability 

The reliability analysis in Sect. III-D holds for the wiretap coding without shaping. When shaping is involved, the problematic set $\mathcal{D}_\ell$ at each level is included in the shaping set $\mathcal{S}_\ell$ and hence determined by the random mapping $\Phi_{\mathcal{S}_\ell}$. In this subsection, we propose two decoders to achieve reliability for the shaping case. The first one requires a private link between Alice and Bob to share the random mapping, and the second one uses the Markov block coding technique [9] without sharing the random mapping. Note that in [6], the message is simply recovered by decoding the fine lattice $\Lambda_b$. When instantiated with a polar lattice, the existence of the problematic set $\mathcal{D}_\ell$ does not permit decoding in this straightforward way. Yet, this is only a limitation of SC decoding, not that of the proposed coding scheme.$^{12}$

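Decoder 1: If $\Phi_{\mathcal{S}_\ell}$ is secretly shared between Alice and Bob, the bits in $\mathcal{D}_\ell$ can be recovered by Bob simply by the shared mapping. By Theorem 3, the reliability at each level can be guaranteed by uniformly distributed independent frozen bits and a random mapping $\Phi_{\mathcal{S}_\ell}$ according to $P_{U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}}$ at each level. The decoding rule is given as follows.

• Decoding: The decoder receives $y^{[N]}$ and estimates $\hat{u}_\ell^{[N]}$ based on the previously recovered $x_{1:\ell-1}^{[N]}$ according to the rule

$$\hat{u}_\ell^i = \begin{cases}
u_\ell^i, & \text{if } i \in \mathcal{F}_\ell \\
\text{the bit given by the shared mapping } \Phi_{\mathcal{S}_\ell}, & \text{if } i \in \mathcal{S}_\ell \\
\arg\max_{u} P_{U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}, Y^{[N]}}(u \,|\, \hat{u}_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}, y^{[N]}), & \text{if } i \in \mathcal{I}_\ell
\end{cases}$$

Note that the probability $P_{U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}, Y^{[N]}}(u \,|\, \hat{u}_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}, y^{[N]})$ can be calculated by the SC decoding algorithm efficiently, treating $Y$ and $X_{1:\ell-1}$ (already decoded by the SC decoder at previous levels) as the outputs of the asymmetric channel. As a result, the expectation of the decoding error probability over the randomized mappings satisfies $E_{\Phi_{\mathcal{S}_\ell}}[P_e(\Phi_{\mathcal{S}_\ell})] = O(2^{-N^{\beta'}})$ for any $\beta' < \beta < 0.5$.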
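The three-case rule of Decoder 1 can be sketched as a simple loop over the indices of one level. This is only an illustration under stated assumptions: `cond_prob_one` is a hypothetical oracle standing in for the SC recursion that returns the conditional probability of a 1 given the decoded prefix, the previous levels and the channel output, and `shared_map` is a hypothetical stand-in for the shared mapping $\Phi_{\mathcal{S}_\ell}$.

```python
from typing import Callable, Dict, List, Set

def sc_decode_level(
    n: int,
    frozen: Dict[int, int],
    shaping: Set[int],
    shared_map: Callable[[int, List[int]], int],
    cond_prob_one: Callable[[int, List[int]], float],
) -> List[int]:
    """Decode one level index by index.

    frozen:        index -> known frozen bit (the set F_l)
    shaping:       indices in S_l, recovered through the shared mapping
    shared_map:    (index, decoded prefix) -> shaping bit (assumed interface)
    cond_prob_one: (index, decoded prefix) -> P(u_i = 1 | prefix, x_{1:l-1}, y)
    """
    u_hat: List[int] = []
    for i in range(n):
        if i in frozen:
            bit = frozen[i]                    # frozen bits are known to Bob
        elif i in shaping:
            bit = shared_map(i, u_hat)         # reproduced from the shared mapping
        else:
            # information index: pick the more likely value (argmax rule)
            bit = 1 if cond_prob_one(i, u_hat) > 0.5 else 0
        u_hat.append(bit)
    return u_hat

# Toy usage with made-up oracles (not a real SC decoder).
decoded = sc_decode_level(
    n=4,
    frozen={0: 1},
    shaping={1},
    shared_map=lambda i, prefix: prefix[-1] ^ 1,
    cond_prob_one=lambda i, prefix: 0.9 if i == 2 else 0.2,
)
print(decoded)
```

In an actual decoder the oracle would be the successive-cancellation likelihood recursion, which is where the $O(N \log N)$ cost per level comes from.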

Consequently, by the union bound for multilevel decoding, the expected block error probability of our wiretap coding scheme vanishes as $N \to \infty$. However, this decoder requires that the mapping be shared only between Alice and Bob. To share this mapping, we can let Alice and Bob have access to the same source of randomness. This means that we may need a private link between Alice and Bob before the described wiretap coding. Fortunately, the rate of this private link can be made vanishing since the proportion of the shaping bits covered by the mapping can be significantly reduced.

12 In fact, the simulation in [8] showed that the unpolarized set is often empty for reasonable parameters. Even if it is not empty, its proportion is vanishing, and one can resort to enumeration, list decoding, etc.

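Recall that the shaping set $\mathcal{S}_\ell$ is defined by

$$\mathcal{S}_\ell = \{ i \in [N] : Z(U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}} \text{ or } 2^{-N^{\beta}} < Z(U_\ell^i | U_\ell^{1:i-1}, Y^{[N]}, X_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}} \}. \tag{25}$$

It has been shown in [36] that the shaping bits in the subset $\{ i \in [N] : Z(U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}) < 2^{-N^{\beta}} \}$ can be recovered according to the rule

$$\hat{u}_\ell^i = \arg\max_{u} P_{U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}}(u \,|\, \hat{u}_\ell^{1:i-1}, x_{1:\ell-1}^{[N]}) \quad \text{if } Z(U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}) < 2^{-N^{\beta}}$$

instead of using the mapping. This modification does not change the result of Theorem 3, and a proof can be found in [40] and [?]. As a result, the mapping only has to cover the unpolarized set

$$d\mathcal{S}_\ell = \{ i \in [N] : 2^{-N^{\beta}} < Z(U_\ell^i | U_\ell^{1:i-1}, X_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}} \text{ or } 2^{-N^{\beta}} < Z(U_\ell^i | U_\ell^{1:i-1}, Y^{[N]}, X_{1:\ell-1}^{[N]}) < 1 - 2^{-N^{\beta}} \},$$

whose proportion $|d\mathcal{S}_\ell| / N \to 0$ as $N \to \infty$.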
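The index sets above can be read off directly from two lists of Bhattacharyya parameters. The sketch below is illustrative only: the function name, the argument `n_beta` (standing for $N^{\beta}$) and the toy values are assumptions, and the parameters themselves would come from the polar code construction, which is not reproduced here.

```python
def split_shaping_indices(z_src, z_ch, n_beta):
    """Classify indices of one level.

    z_src[i]: Z(U^i | U^{1:i-1}, X_{1:l-1})      (without the channel output)
    z_ch[i]:  Z(U^i | U^{1:i-1}, Y, X_{1:l-1})   (with the channel output)
    n_beta:   the value N**beta; the threshold is delta = 2**(-n_beta)
    Returns (shaping set as in (25), argmax-recoverable subset, unpolarized set).
    """
    delta = 2.0 ** (-n_beta)
    shaping, by_argmax, unpolarized = [], [], []
    for i, (zs, zc) in enumerate(zip(z_src, z_ch)):
        mid_src = delta < zs < 1 - delta
        mid_ch = delta < zc < 1 - delta
        if zs < 1 - delta or mid_ch:
            shaping.append(i)        # definition (25)
        if zs < delta:
            by_argmax.append(i)      # almost deterministic, no mapping needed
        if mid_src or mid_ch:
            unpolarized.append(i)    # must still be covered by the mapping
    return shaping, by_argmax, unpolarized

# Made-up parameters for five indices, threshold delta = 2**-7.
z_src = [0.999, 0.5, 0.001, 0.999, 0.0005]
z_ch = [0.001, 0.3, 0.0005, 0.5, 0.0001]
shaping, by_argmax, unpolarized = split_shaping_indices(z_src, z_ch, 7)
print(shaping, by_argmax, unpolarized)
```

Apart from boundary cases where a parameter equals the threshold exactly, every shaping index falls either in the argmax-recoverable subset or in the unpolarized set, which is why the shared mapping only needs to cover the latter.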

Remark 7: When $\Phi_{\mathcal{S}_\ell}$ is shared with Bob, the decoding of $\Lambda_b$ is equivalent to MMSE lattice decoding proposed in [?]. More precisely, by [15, Lemma 7], SC decoding of an asymmetric channel can be converted to that of its symmetrized channel, which is equivalent to an MMSE-scaled channel for lattice Gaussian shaping [?, Lemma 9].

Decoder 2: Alternatively, one can also use the block Markov coding technique [9] to achieve reliability without sharing $\Phi_{\mathcal{S}_\ell}$. As shown in Fig. 8, the message at the $\ell$-th level is divided into $k_\ell$ blocks. Denote by $\Delta S_\ell$ the bits in the unpolarized set $d\mathcal{S}_\ell$. The shaping bits $S_\ell$ for each block are further divided into unpolarized bits $\Delta S_\ell$ and polarized shaping bits $S_\ell \setminus \Delta S_\ell$. As mentioned above, only $\Delta S_\ell$ needs to be covered by the mapping, and its proportion is vanishing. We can sacrifice some message bits to convey $\Delta S_\ell$ for the next block without incurring a significant rate loss. These wasted message bits are denoted by $E_\ell$. For encoding, we start with the last block (Block $k_\ell$). Given $F_\ell$, $M_\ell$ (no $E_\ell$ for the last block) and $R_\ell$, we can obtain $\Delta S_\ell$ according to $\Phi_{\mathcal{S}_\ell}$. Then we copy $\Delta S_\ell$ of the last block to the bits $E_\ell$ of its previous block and do encoding to get the $\Delta S_\ell$ of block $k_\ell - 1$. This process ends when we get the $\Delta S_\ell$ of the first block. This scheme is similar to the one we discussed in Sect. III-D. To achieve reliability, we need a secure code with vanishing rate to convey the bits $\Delta S_\ell$ of the first block to Bob. See [42] for an example of such codes. To guarantee an insignificant rate loss, $k_\ell$ is required to be sufficiently large. We may set $k_\ell = O(N^{\alpha})$ for some $\alpha > 0$.
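The backward chaining of Decoder 2 can be sketched as follows. This is a schematic of the bookkeeping only, under stated assumptions: `unpolarized_bits` is a hypothetical stand-in for the encoder step that produces $\Delta S_\ell$ of a block from its inputs, and frozen and random bits are left out for brevity.

```python
from typing import Callable, List, Tuple

Bits = Tuple[int, ...]

def chain_encode(
    payloads: List[Bits],
    unpolarized_bits: Callable[[Bits, Bits], Bits],
) -> Tuple[List[Tuple[Bits, Bits]], Bits]:
    """Encode blocks from the last one back to the first.

    payloads[k]: genuine message bits of block k (hypothetical layout)
    unpolarized_bits(message, carried): stand-in returning Delta S of a block
    Returns the per-block (message bits, carried E bits) and Delta S of block 1.
    """
    k = len(payloads)
    blocks: List[Tuple[Bits, Bits]] = [((), ())] * k
    carried: Bits = ()                      # the last block has no E bits
    for idx in range(k - 1, -1, -1):
        blocks[idx] = (payloads[idx], carried)
        carried = unpolarized_bits(payloads[idx], carried)
    return blocks, carried                  # Delta S of the first block

# Toy stand-in: two parity bits play the role of Delta S.
toy_delta = lambda msg, e: (sum(msg) % 2, sum(e) % 2)
blocks, first_delta = chain_encode([(1, 0), (1, 1), (0, 1)], toy_delta)
print(blocks, first_delta)
```

Bob would proceed in the opposite direction: once the unpolarized bits of the first block are delivered by the separate low-rate secure code, decoding block $j$ reveals its $E_\ell$ bits, which are exactly the unpolarized bits needed for block $j+1$.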

We also note that it is easy to satisfy the reliability condition when the message bits are not uniformly distributed. Using some additional shared randomness (which can be public), Alice can generate a uniformly random binary sequence $M'_\ell$ which has the same length as $M_\ell$, and share it with Bob. Instead of encoding $M_\ell$ directly, Alice treats $M_\ell \oplus M'_\ell$ as the message, which is now uniform. Clearly, $M_\ell \oplus M'_\ell$ can be reliably decoded by both Decoder 1 and Decoder 2. Therefore, $M_\ell$ can be recovered since Bob knows $M'_\ell$.

[Figure 8: for each of Block 1, Block 2, ..., Block $k_\ell$, the input bits are partitioned into frozen bits $F_\ell$, message bits $M_\ell \setminus E_\ell$, the bits $E_\ell$, random bits $R_\ell$, and shaping bits $S_\ell \setminus \Delta S_\ell$.]

Fig. 8. Markov block coding scheme without sharing the secret mapping.
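The masking step for non-uniform messages amounts to a bitwise xor with a shared uniform sequence. A minimal sketch, assuming byte strings as the message representation (the function name `mask` and the sample message are illustrative, not from the paper):

```python
import secrets

def mask(message: bytes, pad: bytes) -> bytes:
    """Bitwise xor of a message with a pad of the same length."""
    if len(message) != len(pad):
        raise ValueError("pad must have the same length as the message")
    return bytes(m ^ p for m, p in zip(message, pad))

message = b"\x00\x00\x00\x01"             # a highly non-uniform message
pad = secrets.token_bytes(len(message))   # shared randomness, may be public
masked = mask(message, pad)               # what Alice actually encodes
recovered = mask(masked, pad)             # Bob removes the pad after decoding
print(recovered == message)
```

Because the pad is uniform and independent of the message, the masked sequence is uniform regardless of how the message is distributed, so the reliability analysis for uniform inputs applies unchanged.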

Now we present the main theorem of the paper. 


Theorem 5 (Achieving secrecy capacity of the GWC): Consider a multilevel lattice code constructed from polar codes based on asymmetric channels and lattice Gaussian shaping $D_{\Lambda, \sigma_s}$. Given $\sigma_e^2 > \sigma_b^2$, let $\epsilon_\Lambda(\tilde{\sigma}_e)$ be negligible and set the number of levels $r = O(\log \log N)$ for $N \to \infty$. Then all strong secrecy rates $R$ satisfying $R < \frac{1}{2} \log\left(\frac{1 + \mathrm{SNR}_b}{1 + \mathrm{SNR}_e}\right)$ are achievable for the Gaussian wiretap channel under semantic security, where $\mathrm{SNR}_b$ and $\mathrm{SNR}_e$ denote the SNR of the main channel and wiretapper's channel, respectively.

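Proof: The reliability condition and the strong secrecy condition are satisfied by Theorem 3 and Lemma 11, respectively. It remains to illustrate that the secrecy rate approaches the secrecy capacity. For some $\epsilon' \to 0$, we have

$$\begin{aligned}
\lim_{N \to \infty} R &= \sum_{\ell=1}^{r} \lim_{N \to \infty} \frac{|\mathcal{A}_\ell|}{N} \\
&\overset{(a)}{=} \sum_{\ell=1}^{r} I(X_\ell; Y | X_1, \cdots, X_{\ell-1}) - I(X_\ell; Z | X_1, \cdots, X_{\ell-1}) \\
&\overset{(b)}{\ge} \frac{1}{2} \log\left(\frac{1 + \mathrm{SNR}_b}{1 + \mathrm{SNR}_e}\right) - \epsilon',
\end{aligned} \tag{26}$$

where (a) is due to Lemma 10 and (b) is because the signal power $P_s \le \sigma_s^2$ by Lemma 1, respectively.$^{13}$ □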
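For orientation, the rate limit in Theorem 5 is easy to evaluate numerically. The sketch below assumes SNRs given on a linear scale and logarithms to base 2 (so the result is in bits per channel use); the function name is illustrative.

```python
import math

def secrecy_capacity(snr_b: float, snr_e: float) -> float:
    """0.5 * log2((1 + snr_b) / (1 + snr_e)), clipped at zero.

    snr_b: SNR of the main channel (linear scale, assumed)
    snr_e: SNR of the wiretapper's channel (linear scale, assumed)
    """
    return max(0.0, 0.5 * math.log2((1.0 + snr_b) / (1.0 + snr_e)))

# Main channel at SNR 15, wiretapper at SNR 3: 0.5 * log2(16 / 4) = 1 bit.
print(secrecy_capacity(15.0, 3.0))
# A wiretapper with the better channel leaves no positive secrecy rate.
print(secrecy_capacity(3.0, 15.0))
```

The clipping at zero reflects the degradedness assumption $\sigma_e^2 > \sigma_b^2$ of the theorem: outside that regime the expression is negative and no positive secrecy rate is claimed.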


V. Discussion 

We would like to elucidate our coding scheme for the Gaussian wiretap channel in terms of the lattice structure. In Sect. III, we constructed the AWGN-good lattice $\Lambda_b$ and the secrecy-good lattice $\Lambda_e$ without considering the power constraint. When the power constraint is taken into consideration, the lattice Gaussian shaping was implemented in Sect. IV. $\Lambda_b$ and $\Lambda_e$ were then constructed according to the MMSE-scaled main channel and wiretapper's channel, respectively. We note that these two lattices themselves are generated only if the independent frozen bits on all levels are 0s. Since the independent frozen set of the polar codes at each level is filled with random bits, we actually obtain a coset $\Lambda_b + \chi$ of $\Lambda_b$ and a coset $\Lambda_e + \chi$ of $\Lambda_e$ simultaneously, where $\chi$ is a uniformly distributed shift. This is because we cannot fix the independent frozen bits $F_\ell$ in our scheme (due to the lack of a proof that the shaping-induced channel is symmetric). By using the lattice Gaussian $D_{\Lambda, \sigma_s}$ as our constellation in each lattice dimension, we would obtain $D_{\Lambda^N, \sigma_s}$ without coding. Since $\Lambda_e + \chi \subset \Lambda_b + \chi \subset \Lambda^N$, we actually implemented the lattice Gaussian shaping over both $\Lambda_b + \chi$ and $\Lambda_e + \chi$. To summarize our coding scheme, Alice firstly assigns each message $m \in \mathcal{M}$ to a coset $\tilde{\lambda}_m \in \Lambda_b / \Lambda_e$, then randomly sends a point in the coset $\Lambda_e + \chi + \lambda_m$ ($\lambda_m$ is the coset leader of $\tilde{\lambda}_m$) according to the distribution $D_{\Lambda_e + \chi + \lambda_m, \sigma_s}$, via the shaping operation. This scheme is consistent with the theoretical model proposed in [6].

13 Of course, $R$ cannot exceed the secrecy capacity, so this inequality implies that $P_s$ is very close to $\sigma_s^2$.

For semantic security, a symmetrized new induced channel from $\tilde{M}\tilde{F}$ to $(Z^{[N]}, MF \oplus \tilde{M}\tilde{F})$ was constructed to upper-bound the information leakage. This channel is directly derived from the new induced channel from $MF$ to $Z^{[N]}$. According to Lemma 12, this symmetrized new induced channel is degraded with respect to the symmetric randomness-induced channel from $\tilde{M}\tilde{F}$ to $(Z^{[N]}, X^{[N]} \oplus \tilde{X}^{[N]})$. Moreover, when $\tilde{F}$ is frozen, the randomness-induced channel from $\tilde{M}$ to $(Z^{[N]}, X^{[N]} \oplus \tilde{X}^{[N]})$ corresponds to the $\Lambda_b / \Lambda_e$ channel given in Sect. III (with MMSE scaling).


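Appendix A
Proof of Lemma 3

Proof: It is sufficient to show $I(MF; Z^{[N]}) \le N \cdot 2^{-N^{\beta'}}$ since $I(M; Z^{[N]}) \le I(MF; Z^{[N]})$. As has been shown in [8], the induced channel $MF \to Z^{[N]}$ is symmetric when $\mathcal{B}$ and $\mathcal{D}$ are fed with random bits $R$. For a symmetric channel, the maximum mutual information is achieved by the uniform input distribution. Let $\tilde{U}^{\mathcal{A}}$ and $\tilde{U}^{\mathcal{C}}$ denote independent and uniform versions of $M$ and $F$, and let $\tilde{Z}^{[N]}$ be the corresponding channel output. Assuming $i_1 < i_2 < \cdots < i_{|\mathcal{A} \cup \mathcal{C}|}$ are the indices in $\mathcal{A} \cup \mathcal{C}$,

$$\begin{aligned}
I(MF; Z^{[N]}) &\le I(\tilde{U}^{\mathcal{A} \cup \mathcal{C}}; \tilde{Z}^{[N]}) \\
&= \sum_{j=1}^{|\mathcal{A} \cup \mathcal{C}|} I(\tilde{U}^{i_j}; \tilde{Z}^{[N]} \,|\, \tilde{U}^{i_1}, \ldots, \tilde{U}^{i_{j-1}}) \\
&= \sum_{j=1}^{|\mathcal{A} \cup \mathcal{C}|} I(\tilde{U}^{i_j}; \tilde{Z}^{[N]}, \tilde{U}^{i_1}, \ldots, \tilde{U}^{i_{j-1}}) \\
&\le \sum_{j=1}^{|\mathcal{A} \cup \mathcal{C}|} I(\tilde{U}^{i_j}; \tilde{Z}^{[N]}, \tilde{U}^{1:i_j-1}) \\
&= \sum_{j=1}^{|\mathcal{A} \cup \mathcal{C}|} I(W_N^{(i_j)}) \le N \cdot 2^{-N^{\beta'}}.
\end{aligned}$$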


□ 
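Appendix B
Proof of Lemma 4

Proof: According to the definitions of $\mathcal{G}(V)$ and $\mathcal{N}(W)$ presented in (?),

$$\lim_{N \to \infty} \frac{|\mathcal{G}(V)|}{N} = \lim_{N \to \infty} \frac{1}{N} \left| \{ i : Z(V_N^{(i)}) < 2^{-N^{\beta}} \} \right| = C(V),$$

$$\lim_{N \to \infty} \frac{|\mathcal{N}(W)|}{N} = \lim_{N \to \infty} \frac{1}{N} \left| \{ i : Z(W_N^{(i)}) > 1 - 2^{-N^{\beta}} \} \right| = 1 - C(W).$$

Here we define another two sets $\bar{\mathcal{G}}(V)$ and $\bar{\mathcal{N}}(W)$ as

$$\bar{\mathcal{G}}(V) = \{ i : Z(V_N^{(i)}) > 1 - 2^{-N^{\beta}} \},$$

$$\bar{\mathcal{N}}(W) = \{ i : Z(W_N^{(i)}) < 2^{-N^{\beta}} \}.$$

Similarly, we have $\lim_{N \to \infty} \frac{|\bar{\mathcal{G}}(V)|}{N} = 1 - C(V)$ and $\lim_{N \to \infty} \frac{|\bar{\mathcal{N}}(W)|}{N} = C(W)$. Since $W$ is stochastically degraded with respect to $V$, $\bar{\mathcal{G}}(V)$ and $\bar{\mathcal{N}}(W)$ are disjoint with each other [32]; then we have

$$\lim_{N \to \infty} \frac{|\bar{\mathcal{G}}(V) \cup \bar{\mathcal{N}}(W)|}{N} = 1 - C(V) + C(W).$$

By the property of polarization, the proportion of the unpolarized part is vanishing as $N$ goes to infinity, i.e.,

$$\lim_{N \to \infty} \frac{|\mathcal{G}(V) \cup \bar{\mathcal{G}}(V)|}{N} = 1, \qquad \lim_{N \to \infty} \frac{|\mathcal{N}(W) \cup \bar{\mathcal{N}}(W)|}{N} = 1.$$

Finally, we have

$$\lim_{N \to \infty} \frac{|\mathcal{G}(V) \cap \mathcal{N}(W)|}{N} = 1 - \lim_{N \to \infty} \frac{|\bar{\mathcal{G}}(V) \cup \bar{\mathcal{N}}(W)|}{N} = C(V) - C(W).$$

□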
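The counting argument can be illustrated on binary erasure channels, for which the Bhattacharyya parameters of the polarized subchannels follow the exact recursion $z \mapsto (2z - z^2, z^2)$. This is a sketch under stated assumptions: the BEC is used purely as a convenient stand-in for $V$ and $W$ (a BEC with larger erasure probability is degraded with respect to one with smaller erasure probability), and the function names and parameter values are hypothetical.

```python
def bec_bhattacharyya(eps: float, n_levels: int):
    """Bhattacharyya parameters of the 2**n_levels polarized BEC subchannels."""
    z = [eps]
    for _ in range(n_levels):
        # each subchannel splits into a worse (minus) and a better (plus) one
        z = [v for zi in z for v in (2 * zi - zi * zi, zi * zi)]
    return z

def polarized_sets(eps_v: float, eps_w: float, n_levels: int, beta: float):
    """Return (G(V), G-bar(V), N(W), N-bar(W)) as sets of indices."""
    n = 2 ** n_levels
    delta = 2.0 ** (-(n ** beta))
    zv = bec_bhattacharyya(eps_v, n_levels)
    zw = bec_bhattacharyya(eps_w, n_levels)
    good_v = {i for i in range(n) if zv[i] < delta}        # G(V)
    bad_v = {i for i in range(n) if zv[i] > 1 - delta}     # G-bar(V)
    poor_w = {i for i in range(n) if zw[i] > 1 - delta}    # N(W)
    good_w = {i for i in range(n) if zw[i] < delta}        # N-bar(W)
    return good_v, bad_v, poor_w, good_w

gv, bv, nw, gw = polarized_sets(0.1, 0.9, 10, 0.3)
n = 2 ** 10
print("disjoint:", bv.isdisjoint(gw))
print("fraction of G(V) and N(W):", len(gv & nw) / n)
print("C(V) - C(W):", (1 - 0.1) - (1 - 0.9))
```

At a finite block length the printed fraction falls short of $C(V) - C(W)$ because of the unpolarized indices; the lemma states that this gap vanishes as $N$ grows.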


Appendix C
Proof of Lemma 6

Proof: It is sufficient to demonstrate that channel $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ is degraded with respect to $W'(X_\ell; Z | X_{1:\ell-1})$ and that $W'(X_\ell; Z | X_{1:\ell-1})$ is degraded with respect to $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ as well. To see this, we firstly construct a middle channel $\widehat{W}$ from $Z \in \mathcal{V}(\Lambda_r)$ to $\bar{Z} \in \mathcal{V}(\Lambda_\ell)$. For a specific realization $\bar{z}$ of $\bar{Z}$, this $\widehat{W}$ maps $\bar{z} + [\Lambda_\ell/\Lambda_r]$ to $\bar{z}$ with probability 1, where $[\Lambda_\ell/\Lambda_r]$ represents the set of the coset leaders of the partition $\Lambda_\ell/\Lambda_r$. Then we obtain channel $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ by concatenating $W'(X_\ell; Z | X_{1:\ell-1})$ and $\widehat{W}$, which means $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ is degraded with respect to $W'(X_\ell; Z | X_{1:\ell-1})$. Similarly, we can also construct a middle channel $\check{W}$ from $\bar{Z}$ to $Z$. For a specific realization $\bar{z}$ of $\bar{Z}$, this $\check{W}$ maps $\bar{z}$ to $\bar{z} + [\Lambda_\ell/\Lambda_r]$ with probability $\frac{1}{|\Lambda_\ell/\Lambda_r|}$, where $|\Lambda_\ell/\Lambda_r|$ is the order of this partition. This means that $W'(X_\ell; Z | X_{1:\ell-1})$ is also degraded with respect to $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$.

By channel degradation and [33, Lemma 1], letting channels $W$ and $W'$ denote $W(\Lambda_{\ell-1}/\Lambda_\ell, \sigma_e^2)$ and $W'(X_\ell; Z | X_{1:\ell-1})$ for short, we have

$$Z(W_N^{(i)}) \le Z(W_N'^{(i)}) \text{ and } Z(W_N^{(i)}) \ge Z(W_N'^{(i)}),$$

$$I(W_N^{(i)}) \le I(W_N'^{(i)}) \text{ and } I(W_N^{(i)}) \ge I(W_N'^{(i)}),$$

meaning that $Z(W_N^{(i)}) = Z(W_N'^{(i)})$ and $I(W_N^{(i)}) = I(W_N'^{(i)})$. □